Profile-based direct kernels for remote homology detection and fold recognition

Huzefa Rangwala, George Karypis

Research output: Contribution to journal, Article, peer-reviewed

142 Scopus citations


Motivation: Protein remote homology detection is a central problem in computational biology. Supervised learning algorithms based on support vector machines are currently one of the most effective methods for remote homology detection. The performance of these methods depends on how the protein sequences are modeled and on the method used to compute the kernel function between them.

Results: We introduce two classes of kernel functions that are constructed by combining sequence profiles with new and existing approaches for determining the similarity between pairs of protein sequences. These kernels are constructed directly from these explicit protein similarity measures and employ effective profile-to-profile scoring schemes for measuring the similarity between pairs of proteins. Experiments with remote homology detection and fold recognition problems show that these kernels are capable of producing results that are substantially better than those produced by all of the existing state-of-the-art SVM-based methods. In addition, the experiments show that these kernels, even when used in the absence of profiles, produce results that are better than those produced by existing non-profile-based schemes.

Original language: English (US)
Pages (from-to): 4239-4247
Number of pages: 9
Issue number: 23
State: Published - Dec 2005

Bibliographical note

Funding Information:
This work was supported by NSF grants EIA-9986042, ACI-9982274, ACI-0133464, ACI-0312828, and IIS-0431135, by the Army High Performance Computing Research Center under contract number DAAD19-01-2-0014, and by the Digital Technology Center at the University of Minnesota. Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation.
