Identification of novel families of membrane proteins from the model plant Arabidopsis thaliana

Research output: Contribution to journal, Article

82 Citations (Scopus)

Abstract

Motivation: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins including transporters, channels, and pumps contain hydrophobic α-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort the membrane proteins into families of known function, based on experimental evidence or homology, or unknown function. This provides a way to identify target sequences for future functional analysis.

Results: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25470 predicted protein sequences 4589 (18%) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family and no function could be assigned to 1337 of these.
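The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers as printed; it does not reproduce the paper's selection or clustering method, and the variable and function names are illustrative assumptions, not taken from the paper.

```python
# Counts as quoted in the abstract; nothing here comes from the paper's data files.
TOTAL_PREDICTED = 25470           # all predicted protein sequences
MULTI_TMS = 4589                  # sequences with two or more membrane spanning domains
KNOWN = {"families": 211, "sequences": 1764}      # known function or homology
UNKNOWN = {"families": 417, "sequences": 1444}    # no known function, no homology
CLUSTERED = {"families": 628, "sequences": 3208}  # all clustered families
UNCLUSTERED = 1381                # sequences that joined no family
UNCLUSTERED_NO_FUNCTION = 1337    # unclustered sequences with no assigned function


def check_abstract_counts():
    """Assert the quoted counts add up, then return a few derived totals."""
    assert KNOWN["families"] + UNKNOWN["families"] == CLUSTERED["families"]
    assert KNOWN["sequences"] + UNKNOWN["sequences"] == CLUSTERED["sequences"]
    assert CLUSTERED["sequences"] + UNCLUSTERED == MULTI_TMS
    return {
        # Share of all predicted proteins with two or more spanning domains.
        "percent_multi_tms": round(100 * MULTI_TMS / TOTAL_PREDICTED),
        # Derived here, not stated in the abstract.
        "no_function_sequences": UNKNOWN["sequences"] + UNCLUSTERED_NO_FUNCTION,
        "unclustered_with_function": UNCLUSTERED - UNCLUSTERED_NO_FUNCTION,
    }


if __name__ == "__main__":
    print(check_abstract_counts())
```

The family and sequence subtotals sum to the stated totals, and the clustered plus unclustered sequences account for all 4589 selected proteins. The last two values in the returned dictionary are derived here and are not figures reported in the abstract.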

Original language: English (US)
Pages (from-to): 560-563
Number of pages: 4
Journal: Bioinformatics
Volume: 17
Issue number: 6
DOI: 10.1093/bioinformatics/17.6.560
State: Published - Jan 1 2001

Cite this

Identification of novel families of membrane proteins from the model plant Arabidopsis thaliana. / Ward, J. M.

In: Bioinformatics, Vol. 17, No. 6, 01.01.2001, p. 560-563.

@article{9330505fccaf4da5ae8be8a6feca8352,
title = "Identification of novel families of membrane proteins from the model plant Arabidopsis thaliana",
abstract = "Motivation: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins including transporters, channels, and pumps contain hydrophobic α-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort the membrane proteins into families of known function, based on experimental evidence or homology, or unknown function. This provides a way to identify target sequences for future functional analysis. Results: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25470 predicted protein sequences 4589 (18{\%}) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family and no function could be assigned to 1337 of these.",
author = "Ward, {J. M.}",
year = "2001",
month = "1",
day = "1",
doi = "10.1093/bioinformatics/17.6.560",
language = "English (US)",
volume = "17",
pages = "560--563",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",
}

TY - JOUR
T1 - Identification of novel families of membrane proteins from the model plant Arabidopsis thaliana
AU - Ward, J. M.
PY - 2001/1/1
Y1 - 2001/1/1
AB - Motivation: The completion of the Arabidopsis genome offers the first opportunity to analyze all of the membrane protein sequences of a plant. The majority of integral membrane proteins including transporters, channels, and pumps contain hydrophobic α-helices and can be selected based on TransMembrane Spanning (TMS) domain prediction. By clustering the predicted membrane proteins based on sequence, it is possible to sort the membrane proteins into families of known function, based on experimental evidence or homology, or unknown function. This provides a way to identify target sequences for future functional analysis. Results: An automated approach was used to select potential membrane protein sequences from the set of all predicted proteins and cluster the sequences into related families. The recently completed sequence of Arabidopsis thaliana, a model plant, was analyzed. Of the 25470 predicted protein sequences 4589 (18%) were identified as containing two or more membrane spanning domains. The membrane protein sequences clustered into 628 distinct families containing 3208 sequences. Of these, 211 families (1764 sequences) either contained proteins of known function or showed homology to proteins of known function in other species. However, 417 families (1444 sequences) contained only sequences with no known function and no homology to proteins of known function. In addition, 1381 sequences did not cluster with any family and no function could be assigned to 1337 of these.
UR - http://www.scopus.com/inward/record.url?scp=0034954896&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034954896&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/17.6.560
DO - 10.1093/bioinformatics/17.6.560
M3 - Article
C2 - 11395435
AN - SCOPUS:0034954896
VL - 17
SP - 560
EP - 563
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 6
ER -