Classification of proteins based on minimal modular repeats: Lessons from nature in protein design

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Proteins containing internal repeats within their primary sequence have received increased attention recently, as the extent of their presence in various organisms is recognized more fully and their role in evolution is more thoroughly studied. Presented here is a technique used to detect and classify proteins based on a modular evolutionary phenomenon that results in a series of small internal repeats. The parameters chosen are based on a minimum segment of seven residues that results in simple functional scaffolds. The genomes and corresponding proteomes of a variety of eubacteria and archaea have been analyzed using an algorithm that searches prokaryotic genomes for proteins containing small conserved repeats assembled in a modular fashion similar to a recently characterized protein from the organism Nitrosomonas europaea. This analysis has revealed additional proteins present in N. europaea with similar modular characteristics. A further survey of a variety of organisms demonstrates that this evolutionary pathway has been utilized in other organisms as well, yielding a broad assortment of small modular proteins. A thorough description of the sequential characteristics of these modular proteins follows, along with a selection and discussion of the various proteins uncovered through this expanded search and analysis. Several databases of the proteins uncovered from this work and the program used to perform the search are available.
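
As a rough illustration of the kind of search the abstract describes, the sketch below flags proteins that contain a repeated segment of seven residues. It is not the authors' program: the function names `find_repeated_segments` and `screen_proteome`, the `min_copies` threshold, and the restriction to exact, non-overlapping matches are all assumptions made for this example. The abstract speaks of conserved repeats, which may well tolerate substitutions that this simple version does not.

```python
from collections import defaultdict


def find_repeated_segments(sequence, k=7, min_copies=2):
    """Return {segment: [start positions]} for exact k-residue segments
    that occur at least `min_copies` times without overlapping.

    Assumption: exact string matching stands in for "conserved" repeats.
    """
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        segment = sequence[start:start + k]
        hits = positions[segment]
        # Record this occurrence only if it does not overlap the previous one.
        if not hits or start - hits[-1] >= k:
            hits.append(start)
    return {seg: pos for seg, pos in positions.items() if len(pos) >= min_copies}


def screen_proteome(proteins, k=7, min_copies=2):
    """Scan a {name: sequence} mapping (hypothetical input format) and
    keep only the proteins that contain at least one repeated segment."""
    flagged = {}
    for name, sequence in proteins.items():
        repeats = find_repeated_segments(sequence, k, min_copies)
        if repeats:
            flagged[name] = repeats
    return flagged


if __name__ == "__main__":
    # Toy sequences invented for illustration; not real proteins.
    toy_proteome = {
        "toy_repeat": "MKTAYIAGGSGSGSMKTAYIAPP",
        "toy_plain": "ACDEFGHIKLMNPQRSTVWY",
    }
    print(screen_proteome(toy_proteome))
```

Only `toy_repeat` is flagged here, because the segment `MKTAYIA` appears twice in it. A more faithful version would probably score approximate matches, for example with a substitution matrix, rather than require identical segments.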

Original language: English (US)
Pages (from-to): 473-482
Number of pages: 10
Journal: Journal of Proteome Research
Issue number: 3
State: Published - Mar 1 2006


  • Algorithm
  • Database
  • Internal protein repeats
  • Modular design
  • Small proteins

