Characterization of DNA Primary Sequences Based on the Average Distances between Bases

Milan Randić, Subhash C. Basak

Research output: Contribution to journal › Article › peer-review

48 Scopus citations


We outline a numerical characterization of DNA primary sequences based on the average distance between pairs of nucleic acid bases. This leads to a representation of DNA by a condensed 4 × 4 symmetric matrix, the elements of which give the average separation between pairs of bases X, Y in DNA (X, Y = A, C, G, T). As the invariant of choice we consider the leading eigenvalue of the derived 4 × 4 matrix. Additional structurally related invariants are obtained by constructing "higher order" 4 × 4 matrices, derived from the initial 4 × 4 matrix by raising its elements to higher powers. The suitably normalized leading eigenvalues of these matrices offer a novel characterization of DNA primary sequences, referred to as "DNA profiles". The approach is illustrated on exon 1 of the human β-globin gene.
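The construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact distance definition or the normalization, so the choices here are assumptions: distance is taken as the absolute difference of sequence positions, the diagonal entry for a base averages over distinct occurrences of that base, pairs with no occurrences are set to 0, and the eigenvalues are left unnormalized. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
import numpy as np

BASES = "ACGT"

def average_distance_matrix(seq):
    """4 x 4 symmetric matrix of average separations between bases X and Y.

    Assumption: separation is |i - j| for sequence positions i and j.
    """
    seq = seq.upper()
    positions = {b: np.array([i for i, s in enumerate(seq) if s == b], dtype=int)
                 for b in BASES}
    m = np.zeros((4, 4))
    for a, x in enumerate(BASES):
        for b, y in enumerate(BASES):
            px, py = positions[x], positions[y]
            if len(px) == 0 or len(py) == 0:
                continue  # assumption: missing bases contribute 0
            d = np.abs(px[:, None] - py[None, :])
            if x == y:
                n = len(px)
                if n > 1:
                    # average over distinct occurrences only (skip i == j)
                    m[a, b] = d.sum() / (n * (n - 1))
            else:
                m[a, b] = d.mean()
    return m

def leading_eigenvalue(m):
    """Largest eigenvalue of a real symmetric matrix."""
    return float(np.linalg.eigvalsh(m).max())

def dna_profile(seq, max_power=4):
    """Leading eigenvalues of the element-wise powers of the distance matrix.

    Assumption: no normalization is applied, since the abstract does not
    specify one.
    """
    m = average_distance_matrix(seq)
    return [leading_eigenvalue(m ** k) for k in range(1, max_power + 1)]

if __name__ == "__main__":
    toy = "ATGGTGCACC"  # toy input, not the beta-globin exon
    print(average_distance_matrix(toy))
    print(dna_profile(toy))
```

Note that `m ** k` on a NumPy array raises each element to the power k, which matches the abstract's "raising its elements to higher powers"; it is not a matrix power.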

Original language: English (US)
Pages (from-to): 561-568
Number of pages: 8
Journal: Journal of chemical information and computer sciences
Issue number: 3
State: Published - Dec 1 2001

