Characterization of DNA Primary Sequences Based on the Average Distances between Bases

Milan Randić, Subhash C. Basak

Research output: Contribution to journal › Article

45 Scopus citations

Abstract

We outline a numerical characterization of DNA primary sequences based on the average distance between pairs of nucleic acid bases. This leads to a representation of DNA by a condensed 4 × 4 symmetric matrix whose elements give the average separation between bases X and Y in the sequence (X, Y = A, C, G, T). As the invariant of choice we consider the leading eigenvalue of this 4 × 4 matrix. Additional structurally related invariants are obtained by constructing "higher order" 4 × 4 matrices, derived from the initial matrix by raising its elements to higher powers. The suitably normalized leading eigenvalues of these matrices offer a novel characterization of DNA primary sequences, referred to as "DNA profiles". The approach is illustrated on exon 1 of the human β-globin gene.
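The construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how same-base pairs are averaged or how the eigenvalues are normalized, so the sketch averages same-base entries over unordered pairs of positions, sets an entry to 0 when no pair exists, and leaves normalization out. The function names are hypothetical.

```python
from itertools import combinations

BASES = "ACGT"


def average_distance_matrix(seq):
    """Build a 4 x 4 symmetric matrix of average separations between bases."""
    positions = {b: [i for i, s in enumerate(seq) if s == b] for b in BASES}
    matrix = [[0.0] * 4 for _ in range(4)]
    for r, x in enumerate(BASES):
        for c, y in enumerate(BASES):
            if x == y:
                # Assumption: same-base entries average over unordered pairs.
                dists = [j - i for i, j in combinations(positions[x], 2)]
            else:
                dists = [abs(i - j) for i in positions[x] for j in positions[y]]
            # Assumption: an entry with no contributing pair is set to 0.
            matrix[r][c] = sum(dists) / len(dists) if dists else 0.0
    return matrix


def elementwise_power(matrix, k):
    """'Higher order' matrix: every element raised to the power k."""
    return [[v ** k for v in row] for row in matrix]


def leading_eigenvalue(matrix, iterations=500):
    """Estimate the largest eigenvalue by power iteration.

    Adequate for a small non-negative symmetric matrix; a linear algebra
    library would be the usual choice in practice.
    """
    n = len(matrix)
    vec = [1.0] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0.0:
            return 0.0
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the converged unit vector.
    mv = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(vec[i] * mv[i] for i in range(n))


def dna_profile(seq, max_power=3):
    """Unnormalized leading eigenvalues for element powers 1..max_power."""
    base = average_distance_matrix(seq)
    return [leading_eigenvalue(elementwise_power(base, k))
            for k in range(1, max_power + 1)]
```

Calling `dna_profile` on a sequence string returns one leading eigenvalue per power; the normalization mentioned in the abstract would have to be applied to these values before they could be compared with the published "DNA profiles".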

Original language: English (US)
Pages (from-to): 561-568
Number of pages: 8
Journal: Journal of Chemical Information and Computer Sciences
Volume: 41
Issue number: 3
State: Published - Dec 1 2001
