Artificial neural network-based analysis of high-throughput screening data for improved prediction of active compounds

Swapan Chakrabarti, Stan R. Svojanovsky, Romana Slavik, Gunda I. Georg, George S. Wilson, Peter G. Smith

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Artificial neural networks (ANNs) were trained using high-throughput screening (HTS) data to recover active compounds from a large data set. Classification performance improved when predictions made by multiple ANNs were combined. The HTS data, acquired from a methionine aminopeptidases inhibition study, consisted of a library of 43,347 compounds, and the ratio of active to nonactive compounds, RA/N, was 0.0321. Back-propagation ANNs were trained and validated using principal components derived from the physicochemical features of the compounds. With carefully selected training parameters, a single ANN recovered one-third of all active compounds from the validation set with a 3-fold gain in RA/N value. Further gains in RA/N were obtained by combining the predictions made by a number of ANNs. The generalization property of back-propagation ANNs was exploited by training those ANNs on the same training samples after initializing each with a different set of random weights. As a result, only 10% of all available compounds were needed for training and validation, and the rest of the data set was screened with more than a 10-fold gain over the original RA/N value. Thus, ANNs trained with limited HTS data might become useful in recovering active compounds from large data sets.
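The RA/N ratio and the "fold gain" quoted in the abstract can be illustrated with a small sketch. The abstract does not say how the predictions of the individual ANNs were combined, so the vote-threshold rule below (`consensus_select`, `min_votes`) is an assumption made purely for illustration, and the labels and votes are invented toy data, not values from the study.

```python
from typing import List, Sequence


def active_ratio(labels: Sequence[int]) -> float:
    """Ratio of active (1) to nonactive (0) compounds, i.e. RA/N."""
    actives = sum(labels)
    nonactives = len(labels) - actives
    if nonactives == 0:
        raise ValueError("RA/N is undefined without nonactive compounds")
    return actives / nonactives


def consensus_select(votes: Sequence[Sequence[int]], min_votes: int) -> List[int]:
    """Indices of compounds flagged active by at least `min_votes` models.

    Assumed combination rule; the abstract does not specify one.
    """
    n_compounds = len(votes[0])
    return [i for i in range(n_compounds)
            if sum(model[i] for model in votes) >= min_votes]


def fold_gain(labels: Sequence[int], selected: Sequence[int]) -> float:
    """RA/N within the selected subset divided by RA/N of the whole set."""
    picked = [labels[i] for i in selected]
    return active_ratio(picked) / active_ratio(labels)


# Toy data (hypothetical): 2 actives among 10 compounds, so RA/N = 2/8.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# Binary predictions from three hypothetical models that differ only in
# their random initialization.
votes = [
    [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0],
]

selected = consensus_select(votes, min_votes=2)
gain = fold_gain(labels, selected)
print(selected, gain)
```

In this toy case, requiring agreement between at least two models discards the compounds that only one model flagged, which raises the ratio of actives to nonactives in the selected subset relative to the full set.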

Original language: English (US)
Pages (from-to): 1236-1244
Number of pages: 9
Journal: Journal of Biomolecular Screening
Issue number: 10
State: Published - Dec 2009


Keywords

  • Generalization property
  • Neural networks
  • Pattern classification


