As genome sequencing has advanced and more whole genomes of microorganisms have been completed, many methods have been introduced to reconstruct the phylogenetic tree of those microorganisms from whole-genome information. These methods transform or map the whole genome sequences into other forms that describe evolutionary distance in a new way. We hypothesize that the whole genome carries information, transferred along lineages, that remains stable and is more essential than the sequence conservation of individual genes or the arrangement of a selected set of genes. A measurement is therefore needed that captures as many phylogenetic features beyond the genome sequence itself as possible. We converted each microbial genome sequence into another linear sequence that represents its functional structure, used a new information function to calculate the discrepancy between these sequences and obtain a distance matrix of the genomes, and built a phylogenetic tree with the neighbor-joining method. In the resulting tree, the major lineages are consistent with those obtained from 16S rRNA sequences. Our method thus identifies a phylogenetic feature, derived from the genome sequences and their encoded genes, that reconstructs the phylogenetic tree correctly. Mapping a genome sequence to a new form that represents the relative positions of its functional genes provides a new way to measure phylogenetic relationships, and a more specific classification of gene functions could make the result more sensitive.
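The pipeline outlined in the abstract (map each genome to a sequence of gene functions, score pairwise discrepancy, build a distance matrix, join neighbors) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not define its information function, so the Jensen-Shannon divergence between adjacent-category pair frequencies is swapped in as a stand-in, and the single-letter category codes and the names `bigram_distribution`, `js_divergence`, `distance_matrix`, and `neighbor_joining` are hypothetical. Each genome is assumed to contain at least two annotated genes.

```python
from collections import Counter
from itertools import combinations
from math import log2


def bigram_distribution(categories):
    """Relative frequencies of adjacent functional-category pairs.

    `categories` is the gene order of one genome, with each gene replaced
    by a (hypothetical) functional-category code.
    """
    pairs = Counter(zip(categories, categories[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (range 0 to 1).

    Stand-in for the information function, which the abstract leaves
    unspecified.
    """
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(pa * log2(pa / mix[k]) for k, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def distance_matrix(genomes):
    """Symmetric pairwise distances, keyed by (name_a, name_b)."""
    dists = {name: bigram_distribution(seq) for name, seq in genomes.items()}
    d = {}
    for a, b in combinations(genomes, 2):
        d[a, b] = d[b, a] = js_divergence(dists[a], dists[b])
    return d


def neighbor_joining(names, d):
    """Neighbor-joining topology as nested tuples (branch lengths omitted).

    Joining stops at three nodes, which form the unrooted central split.
    """
    nodes = list(names)
    d = dict(d)
    while len(nodes) > 3:
        n = len(nodes)
        r = {i: sum(d[i, j] for j in nodes if j != i) for i in nodes}
        # Join the pair minimizing Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]],
        )
        new = (i, j)
        for k in nodes:
            if k not in (i, j):
                d[new, k] = d[k, new] = 0.5 * (d[i, k] + d[j, k] - d[i, j])
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return tuple(nodes)


# Toy input: invented category strings, not real annotation data.
toy_genomes = {
    "A": list("JKLJKLJKL"),
    "B": list("JKLJKLJKC"),
    "C": list("CEGCEGCEG"),
    "D": list("CEGCEGCEJ"),
}
toy_tree = neighbor_joining(list(toy_genomes), distance_matrix(toy_genomes))
```

Counting adjacent pairs is only one simple way to reflect the relative positions of functional genes; a finer classification of gene functions would enlarge the alphabet of category codes without changing the rest of the sketch.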
Bibliographical note
Funding Information:
This work is supported by the Chinese National Scientific Foundation under grant numbers 39890070, 19890380, and 39993420.
- Information theory
- Phylogenetic trees
- Whole genome analysis