Identification of novel families of membrane proteins from the model plant Arabidopsis thaliana

Research output: Contribution to journal › Article › peer-review

89 Scopus citations

Abstract

Motivation: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins, including transporters, channels, and pumps, contain hydrophobic α-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort them into families of known function (based on experimental evidence or homology) or of unknown function. This provides a way to identify target sequences for future functional analysis.

Results: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25470 predicted protein sequences, 4589 (18%) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family, and no function could be assigned to 1337 of these.
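The counts reported in the Results can be cross-checked with a short tally. The sketch below uses only the figures quoted in the abstract. The variable names are illustrative, and the combined "no function assigned" total is a derived figure (an assumption about how the categories add up), not a number stated by the authors.

```python
# Counts as reported in the abstract.
total_predicted = 25470
membrane = 4589               # two or more predicted membrane-spanning domains

families_known, seqs_known = 211, 1764      # known function or homology
families_unknown, seqs_unknown = 417, 1444  # no known function, no homology
singletons = 1381             # sequences that did not cluster with any family
singletons_unassigned = 1337  # singletons with no function assigned


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Derived figure (not stated in the abstract): membrane sequences
# without an assigned function, in families or as singletons.
unassigned = seqs_unknown + singletons_unassigned

summary = {
    "families": families_known + families_unknown,
    "clustered_sequences": seqs_known + seqs_unknown,
    "membrane_total": seqs_known + seqs_unknown + singletons,
    "membrane_pct_of_proteome": percent(membrane, total_predicted),
    "no_function_assigned": unassigned,
    "no_function_pct_of_membrane": percent(unassigned, membrane),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The family and sequence subtotals reproduce the reported 628 families and 3208 clustered sequences, and adding the 1381 unclustered sequences recovers the 4589 predicted membrane proteins.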

Original language: English (US)
Pages (from-to): 560-563
Number of pages: 4
Journal: Bioinformatics
Volume: 17
Issue number: 6
State: Published - 2001
