Alpha-helical transmembrane proteins mediate many key biological processes and represent 20%-30% of all genes in many organisms. Because their high-resolution 3D structures are difficult to determine experimentally, computational methods that predict the location and orientation of transmembrane helix segments from sequence information are essential. We present TOPTMH, a new transmembrane helix topology prediction method that combines support vector machines, hidden Markov models, and a widely used rule-based scheme. The contribution of this work is a prediction approach that first uses a binary SVM classifier to predict the helix residues and then employs a pair of HMM models that incorporate the SVM predictions and hydropathy-based features to identify entire transmembrane helix segments by capturing the structural characteristics of these proteins. TOPTMH outperforms state-of-the-art prediction methods and achieves the best performance on an independent static benchmark.
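The two-stage idea in the abstract (score each residue, then let a hidden Markov model turn noisy per-residue scores into contiguous segments) can be illustrated with a minimal sketch. This is not TOPTMH's implementation: the per-residue stage below is a logistic squashing of windowed Kyte-Doolittle hydropathy standing in for the trained SVM, a single two-state HMM replaces the pair of HMMs, and orientation is not predicted. All function names and parameters (`half_width`, `midpoint`, `slope`, `stay`) are hypothetical choices made for illustration.

```python
import math

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, half_width=7):
    """Mean hydropathy in a window centred on each residue (truncated at the ends)."""
    scores = []
    for i in range(len(seq)):
        window = seq[max(0, i - half_width): i + half_width + 1]
        scores.append(sum(KD[aa] for aa in window) / len(window))
    return scores


def per_residue_helix_prob(scores, midpoint=1.6, slope=2.0):
    """Assumed stand-in for the SVM stage: map hydropathy to a pseudo-probability."""
    return [1.0 / (1.0 + math.exp(-slope * (s - midpoint))) for s in scores]


def viterbi_segments(probs, stay=0.95):
    """Decode the most likely helix (H) / non-helix (N) path with a two-state HMM.

    Emission likelihoods come from the per-residue probabilities; the
    self-transition probability `stay` discourages very short segments.
    """
    if not probs:
        return ""
    states = "HN"
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def emit(state, p):
        # Clamp to avoid log(0) when a probability saturates.
        return math.log(max(p if state == "H" else 1.0 - p, 1e-9))

    best = {s: math.log(0.5) + emit(s, probs[0]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for p in probs[1:]:
        new_best, pointers = {}, {}
        for s in states:
            cands = {
                prev: best[prev] + (log_stay if prev == s else log_switch)
                for prev in states
            }
            prev = max(cands, key=cands.get)
            new_best[s] = cands[prev] + emit(s, p)
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Trace back from the best final state.
    state = max(best, key=best.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


# Toy sequence: a hydrophobic stretch flanked by charged residues.
toy = "KRDEKRDEKR" + "L" * 20 + "KRDEKRDEKR"
labels = viterbi_segments(per_residue_helix_prob(window_hydropathy(toy)))
print(toy)
print(labels)
```

In this sketch the HMM only smooths the labels. A method of the kind the abstract describes would use richer state structures so that segment lengths and the inside/outside orientation of loops are modeled explicitly.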
|Original language||English (US)|
|Title of host publication||Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2008, Proceedings|
|Number of pages||16|
|State||Published - 2008|
|Event||European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2008 - Antwerp, Belgium|
|Duration||Sep 15 2008 → Sep 19 2008|
|Publication series||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
Bibliographical note
Funding Information:
This work was supported by NSF EIA-9986042, ACI-0133464, IIS-0431135, NIH RLM008713A, NIH T32GM008347, the Digital Technology Center, University of Minnesota and the Minnesota Supercomputing Institute.