Two molecular similarity methods have been used to select nearest neighbors from four different sets of chemicals. The first method is based on the Euclidean distance between chemicals in the ten-dimensional principal component space derived from 97 graph invariants. The second is based on the count of atom pairs common to a pair of molecules. Two probe chemicals were selected, and the neighbors of each were determined by both methods for the following four sets of molecules: (a) a combined set of octane and nonane isomers, (b) a more diverse set of 382 chemicals, (c) a diverse set of 3692 chemicals, and (d) the STARLIST database of log P values, consisting of 4067 structures. The results show that both measures reflect an intuitive notion of chemical similarity.
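The two neighbor-selection ideas described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the atom type used here (element plus heavy-atom degree), the normalization 2C/(T1 + T2), and the function names `atom_pairs`, `atom_pair_similarity`, and `nearest_neighbors` are all hypothetical choices made for the example. The principal component coordinates are assumed to be precomputed, since the abstract does not list the 97 graph invariants.

```python
from collections import Counter, deque
from math import dist


def atom_pairs(elements, bonds):
    """Return the multiset of simplified atom pairs of a hydrogen-suppressed graph.

    elements: list of element symbols, one per heavy atom
    bonds:    list of (i, j) atom index pairs
    """
    n = len(elements)
    adj = [[] for _ in range(n)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    # Assumed atom type: (element, number of heavy-atom neighbors).
    types = [(elements[i], len(adj[i])) for i in range(n)]
    pairs = Counter()
    for src in range(n):
        # Breadth-first search gives the bond-count distance from src.
        seen = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for dst, d in seen.items():
            if dst > src:  # count each unordered pair once
                a, b = sorted((types[src], types[dst]))
                pairs[(a, d, b)] += 1
    return pairs


def atom_pair_similarity(p, q):
    """Common atom pairs, normalized to [0, 1] (assumed normalization)."""
    common = sum((p & q).values())  # multiset intersection
    total = sum(p.values()) + sum(q.values())
    return 2 * common / total if total else 1.0


def nearest_neighbors(probe, library, distance, k=5):
    """Return the names of the k library entries closest to the probe."""
    return sorted(library, key=lambda name: distance(probe, library[name]))[:k]


# Method 1: Euclidean distance in a (precomputed, hypothetical) PC space.
pc_space = {"a": (0.0, 0.0), "b": (1.0, 1.0), "c": (5.0, 5.0)}
pc_neighbors = nearest_neighbors((0.9, 0.9), pc_space, dist, k=2)

# Method 2: atom-pair similarity, turned into a distance as 1 - similarity.
butane = atom_pairs(["C"] * 4, [(0, 1), (1, 2), (2, 3)])
isobutane = atom_pairs(["C"] * 4, [(0, 1), (0, 2), (0, 3)])
pentane = atom_pairs(["C"] * 5, [(0, 1), (1, 2), (2, 3), (3, 4)])
ap_neighbors = nearest_neighbors(
    butane,
    {"isobutane": isobutane, "pentane": pentane},
    lambda p, q: 1.0 - atom_pair_similarity(p, q),
    k=1,
)
```

Both methods share the same ranking step and differ only in the descriptor and the distance function, which is why `nearest_neighbors` takes the distance as a parameter.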
|Original language|English (US)|
|Number of pages|7|
|Journal|Journal of Chemical Information and Computer Sciences|
|State|Published - Mar 1 1994|