Similarity, like beauty, is an intuitive concept based on personal perception and bias. In the realm of molecular similarity, each method is user-defined, based on the features deemed important. A method's efficacy depends on the set of descriptors used to define the intermolecular similarity of chemicals and on the mathematical function used to quantify similarity. Quantitative molecular similarity analysis (QMSA) methods, based on experimental data or computed molecular descriptors, have emerged as powerful tools for analog selection and property estimation. We have carried out a comparative study of similarity spaces derived from atom pairs and a large set of topological indices for two diverse sets of chemicals: (a) a set of 469 chemicals with vapor pressure data from the TSCA inventory, and (b) a set of 213 chemicals with lipophilicity data from the STARLIST inventory. These spaces were used for the K-nearest-neighbor (KNN) estimation of properties (K = 1–10, 15, 20, 25). The results for the QMSA models developed in this paper are also compared with model estimates derived from hierarchical QSARs.
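To make the KNN-based estimation step concrete, here is a minimal sketch in Python. It assumes each chemical is already represented by a numeric descriptor vector, that dissimilarity is Euclidean distance, and that the estimate is the unweighted mean of the neighbors' property values; the abstract does not specify these choices, and the function names (`knn_estimate`, `loo_mean_abs_error`) are hypothetical. Only the list of K values comes from the abstract.

```python
import math

# K values reported in the abstract: 1-10, 15, 20, 25.
K_VALUES = list(range(1, 11)) + [15, 20, 25]


def knn_estimate(descriptors, values, query_index, k):
    """Estimate the property of one chemical from its k nearest neighbors.

    The query chemical itself is excluded (leave-one-out). Euclidean
    distance and an unweighted mean are assumptions of this sketch.
    """
    query = descriptors[query_index]
    distances = []
    for i, row in enumerate(descriptors):
        if i == query_index:
            continue
        distances.append((math.dist(query, row), i))
    # Sort by distance; ties are broken by the lower index.
    distances.sort()
    nearest = [i for _, i in distances[:k]]
    return sum(values[i] for i in nearest) / len(nearest)


def loo_mean_abs_error(descriptors, values, k_values):
    """Leave-one-out mean absolute error for each k in k_values."""
    errors = {}
    for k in k_values:
        abs_errs = [
            abs(knn_estimate(descriptors, values, i, k) - values[i])
            for i in range(len(values))
        ]
        errors[k] = sum(abs_errs) / len(abs_errs)
    return errors
```

Calling `loo_mean_abs_error(descriptors, values, K_VALUES)` on a descriptor matrix and its measured property values would give one error figure per K, which is one way such similarity spaces could be compared.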
|Original language||English (US)|
|Number of pages||15|
|Journal||Journal of Molecular Graphics and Modelling|
|State||Published - 2001|
Bibliographical note
Funding Information:
This is contribution number 297 from the Center for Water and the Environment of the Natural Resources Research Institute. Research reported in this paper was supported by grant F49620-98-1-0015 from the United States Air Force. The authors would like to thank Denise Mills and Gregory Grunwald for their continued assistance.
Keywords
- Atom pairs
- Hierarchical QSAR
- Molecular similarity
- Property estimation
- Structure space
- Topological indices