Abstract
Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact-map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of the different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.224, an improvement over a random predictor by a factor of 11.7, which is better than the results reported in previous studies. Our study showed that predicted secondary structure features play an important role for proteins containing beta-structures. Models based on secondary structure features and correlated mutation analysis features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.
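The two figures quoted in the abstract, prediction accuracy and improvement over a random predictor, can be illustrated with a minimal sketch. The abstract does not spell out the definitions, so the sketch assumes the ones commonly used in contact map prediction: accuracy is the fraction of predicted contacts that are true contacts, and a random predictor's expected accuracy is the fraction of candidate residue pairs that are in contact. The function names and the toy residue pairs are hypothetical and are not taken from the paper.

```python
def contact_accuracy(predicted, native):
    """Fraction of predicted contacts that are native contacts (assumed definition)."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(native)) / len(predicted)


def random_accuracy(native, n_pairs):
    """Expected accuracy of a uniform random predictor: the contact density."""
    return len(set(native)) / n_pairs


def improvement_over_random(predicted, native, n_pairs):
    """Ratio of the predictor's accuracy to the random baseline."""
    return contact_accuracy(predicted, native) / random_accuracy(native, n_pairs)


# Toy example with made-up residue pairs (not data from the paper).
native = {(1, 8), (2, 9), (3, 12), (5, 14)}
predicted = [(1, 8), (2, 9), (4, 11), (6, 13)]
n_pairs = 100  # hypothetical number of candidate residue pairs

acc = contact_accuracy(predicted, native)    # 2 of 4 predictions are correct
rand = random_accuracy(native, n_pairs)      # 4 contacts among 100 pairs
factor = improvement_over_random(predicted, native, n_pairs)
print(f"accuracy={acc:.3f} random={rand:.3f} improvement={factor:.1f}x")
```

Under these assumed definitions, and reading the reported factor as a simple ratio, an accuracy of 0.224 with an 11.7-fold improvement would correspond to a random baseline of roughly 0.224 / 11.7 ≈ 0.019. If the paper averages the improvement factor per protein, the implied baseline may differ.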
Original language | English (US) |
---|---|
Pages (from-to) | 849-865 |
Number of pages | 17 |
Journal | International Journal on Artificial Intelligence Tools |
Volume | 14 |
Issue number | 5 |
State | Published - Oct 2005 |
Bibliographical note
Funding Information: This work was supported by NSF CCR-9972519, EIA-9986042, ACI-9982274, ACI-0133464, by Army Research Office contract DA/DAAG55-98-1-0441, by the DOE ASCI program, and by Army High Performance Computing Research Center contract number DAAH04-95-C-0008. Related papers are available via WWW at URL: http://www.cs.umn.edu/~karypis
Keywords
- Contact map prediction
- Correlated mutation analysis
- Support vector machines