Please use this identifier to cite or link to this item: https://hdl.handle.net/10316/108631
Title: A Machine Learning Approach for Hot-Spot Detection at Protein-Protein Interfaces
Authors: Melo, Rita 
Fieldhouse, Robert
Melo, André 
Correia, João D. G.
Cordeiro, Maria Natália D. S.
Gümüş, Zeynep H.
Costa, Joaquim 
Bonvin, Alexandre M. J. J. 
Moreira, Irina S. 
Keywords: protein-protein interfaces; hot-spots; machine learning; Solvent Accessible Surface Area (SASA); evolutionary sequence conservation
Issue Date: 27-Jul-2016
Publisher: MDPI
Project: SFRH/BPD/97650/2013 
UID/Multi/04349/2013 
info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UID/NEU/04539/2013/PT 
Journal: International Journal of Molecular Sciences
Volume: 17
Issue: 8
Abstract: Understanding protein-protein interactions is a key challenge in biochemistry. In this work, we describe a methodology that predicts Hot-Spots (HS) in protein-protein interfaces from their native complex structure more accurately than previously published Machine Learning (ML) techniques. Our model is trained on a large number of complexes and on a significantly larger number of structural- and evolutionary sequence-based features. In particular, we added interface size, the type of interaction between residues at the interface of the complex, the number of different types of residues at the interface and the Position-Specific Scoring Matrix (PSSM), for a total of 79 features. We used twenty-seven algorithms, ranging from a simple linear-based function to support-vector machine models with different cost functions. The best model was obtained with the conditional inference random forest (c-forest) algorithm on a dataset pre-processed by normalization of the features and up-sampling of the minority class. On the independent test set, the method has an overall accuracy of 0.80, an F1-score of 0.73, a sensitivity of 0.76 and a specificity of 0.82.
URI: https://hdl.handle.net/10316/108631
ISSN: 1422-0067
DOI: 10.3390/ijms17081215
Rights: openAccess
Appears in Collections: I&D CNC - Artigos em Revistas Internacionais

This item is licensed under a Creative Commons License.
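The abstract mentions up-sampling of the minority class and reports accuracy, F1-score, sensitivity and specificity. The following is a minimal sketch of those two ideas in plain Python, not the authors' implementation: the function names `upsample_minority` and `binary_metrics`, the fixed seed, and the toy labels are all assumptions made for illustration, and it assumes a two-class problem where label 1 marks a hot-spot.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Resample the minority class with replacement until both classes
    have the same number of examples (assumes exactly two classes)."""
    rng = random.Random(seed)
    (_, n_major), (minor, n_minor) = Counter(labels).most_common(2)
    minority_idx = [i for i, y in enumerate(labels) if y == minor]
    # Draw as many extra minority examples as needed to close the gap.
    extra = [rng.choice(minority_idx) for _ in range(n_major - n_minor)]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity and F1 from the
    confusion-matrix counts of a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy, made-up data: three non-hot-spots (0) and two hot-spots (1).
    rows = [[0.1], [0.2], [0.3], [0.4], [0.5]]
    labels = [0, 0, 0, 1, 1]
    _, balanced = upsample_minority(rows, labels)
    print(Counter(balanced))

    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

In a workflow of this kind, up-sampling would normally be applied to the training data only, so that scores on an independent test set reflect the original class balance.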