TY - JOUR
T1 - SplitTester
T2 - Software to identify domains responsible for functional divergence in protein family
AU - Gao, Xiang
AU - Vander Velden, Kent A.
AU - Voytas, Daniel F.
AU - Gu, Xun
PY - 2005/6/1
Y1 - 2005/6/1
N2 - Background: Many protein families have undergone functional divergence after gene duplications, such that current subgroups of the family carry out overlapping but distinct biological roles. For protein families with known functional subtypes (a functional split), we developed the software SplitTester to identify potential regions that are responsible for the observed distinct functional subtypes within the same protein family. Results: Our software, SplitTester, takes as input a multiple protein sequence alignment generated from protein members of two subgroups with known functional divergence. SplitTester was designed to construct a neighbor-joining tree (a split cluster) from variable-sized sliding windows across the alignment in a process called split-clustering. SplitTester identifies the regions whose split cluster is consistent with the functional split but may be inconsistent with the phylogeny of the protein family. We hypothesize that at least some of these identified regions, which do not follow a random mutation process, are responsible for the observed functional split. To test our method, we used reverse transcriptase from a group of Pseudoviridae retrotransposons to identify residues specific for diverged primer recognition. Candidate regions were then mapped onto the three-dimensional structures of reverse transcriptase. The locations of these amino acids within the enzyme are consistent with their biological roles. Conclusions: SplitTester aims to identify specific domain sequences responsible for functional divergence of subgroups within a protein family. From the analysis of the retroelement reverse transcriptase family, we successfully identified the regions that split this family according to primer specificity, implying that they function in specific primer selection.
AB - Background: Many protein families have undergone functional divergence after gene duplications, such that current subgroups of the family carry out overlapping but distinct biological roles. For protein families with known functional subtypes (a functional split), we developed the software SplitTester to identify potential regions that are responsible for the observed distinct functional subtypes within the same protein family. Results: Our software, SplitTester, takes as input a multiple protein sequence alignment generated from protein members of two subgroups with known functional divergence. SplitTester was designed to construct a neighbor-joining tree (a split cluster) from variable-sized sliding windows across the alignment in a process called split-clustering. SplitTester identifies the regions whose split cluster is consistent with the functional split but may be inconsistent with the phylogeny of the protein family. We hypothesize that at least some of these identified regions, which do not follow a random mutation process, are responsible for the observed functional split. To test our method, we used reverse transcriptase from a group of Pseudoviridae retrotransposons to identify residues specific for diverged primer recognition. Candidate regions were then mapped onto the three-dimensional structures of reverse transcriptase. The locations of these amino acids within the enzyme are consistent with their biological roles. Conclusions: SplitTester aims to identify specific domain sequences responsible for functional divergence of subgroups within a protein family. From the analysis of the retroelement reverse transcriptase family, we successfully identified the regions that split this family according to primer specificity, implying that they function in specific primer selection.
UR - http://www.scopus.com/inward/record.url?scp=25444453179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=25444453179&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-6-137
DO - 10.1186/1471-2105-6-137
M3 - Article
C2 - 15929795
AN - SCOPUS:25444453179
SN - 1471-2105
VL - 6
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 137
ER -