Molecular similarity-based estimation of properties: A comparison of three structure spaces

Brian D. Gute, Subhash C. Basak

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

Similarity, like beauty, is an intuitive concept based on personal perception and bias. In the realm of molecular similarity, each method is user defined based on the features deemed important. A method's efficacy depends on the set of descriptors used to define the intermolecular similarity of chemicals and on the mathematical function used to quantify similarity. Quantitative molecular similarity analysis (QMSA) methods, based on experimental data or computed molecular descriptors, have emerged as powerful tools for analog selection and property estimation. We have carried out a comparative study of similarity spaces derived from atom pairs and a large set of topological indices for two diverse sets of chemicals: (a) a set of 469 chemicals with vapor pressure data from the TSCA inventory, and (b) a set of 213 chemicals with lipophilicity data from the STARLIST inventory. These spaces were used for the KNN-based estimation of properties (K=1-10, 15, 20, 25). The results for the QMSA models developed in this paper are also compared with model estimates derived from hierarchical QSARs.
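
The KNN-based estimation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance function, the averaging rule, or the validation scheme. The code below therefore assumes Euclidean distance in a descriptor space, an unweighted mean of the K nearest neighbors' property values, and leave-one-out evaluation. The function names and toy data are hypothetical and are not taken from the paper.

```python
import math

# K values listed in the abstract: 1-10, 15, 20, 25
K_VALUES = list(range(1, 11)) + [15, 20, 25]


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_estimate(query, descriptors, properties, k):
    """Estimate a property as the unweighted mean over the k nearest neighbors."""
    ranked = sorted(
        range(len(descriptors)),
        key=lambda i: euclidean(query, descriptors[i]),
    )
    nearest = ranked[:k]  # uses every chemical if k exceeds the set size
    return sum(properties[i] for i in nearest) / len(nearest)


def leave_one_out(descriptors, properties, k):
    """Estimate each chemical's property from all the other chemicals."""
    estimates = []
    for i in range(len(descriptors)):
        rest_d = descriptors[:i] + descriptors[i + 1:]
        rest_p = properties[:i] + properties[i + 1:]
        estimates.append(knn_estimate(descriptors[i], rest_d, rest_p, k))
    return estimates


# Toy one-dimensional "structure space" with made-up property values
toy_descriptors = [[0.0], [1.0], [3.0], [10.0]]
toy_properties = [1.0, 2.0, 4.0, 10.0]

loo_k1 = leave_one_out(toy_descriptors, toy_properties, k=1)
query_k2 = knn_estimate([2.5], toy_descriptors, toy_properties, k=2)
```

In a comparison of structure spaces, the same routine would be run once per space (for example, atom-pair-based versus topological-index-based descriptors) and once per value in `K_VALUES`, with the resulting estimates scored against the measured property values.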

Original language: English (US)
Pages (from-to): 95-109
Number of pages: 15
Journal: Journal of Molecular Graphics and Modelling
Volume: 20
Issue number: 1
State: Published - 2001

Bibliographical note

Funding Information:
This is contribution number 297 from the Center for Water and the Environment of the Natural Resources Research Institute. Research reported in this paper was supported by grant F49620-98-1-0015 from the United States Air Force. The authors would like to thank Denise Mills and Gregory Grunwald for their continued assistance.

Keywords

  • Atom pairs
  • Hierarchical QSAR
  • Molecular similarity
  • Property estimation
  • Structure space
  • Topological indices
