Mathematical structural descriptors and mutagenicity assessment: a study with congeneric and diverse datasets

S. Majumdar, Subhash C Basak, C. N. Lungu, M. V. Diudea, G. D. Grunwald

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Quantitative bioactivity and toxicity assessment of chemical compounds plays a central role in drug discovery, as it saves a substantial amount of resources. To this end, high-performance computing has enabled researchers and practitioners to leverage hundreds, or even thousands, of computed molecular descriptors for the activity prediction of candidate compounds. In this paper, we empirically evaluate the utility of two large groups of chemical descriptors for such predictive modelling, as well as for chemical structure discovery. We use a suite of commercially available and in-house software to calculate molecular descriptors for two sets of chemical mutagens: a homogeneous set of 95 amines and a diverse set of 508 chemicals. With the calculated descriptors, we model the mutagenic activity of these compounds using a number of methods from the statistics and machine-learning literature, and use robust principal component analysis to investigate the low-dimensional subspaces that characterize these chemicals. Our results suggest that combining different sets of descriptors is likely to result in a better predictive model, although this depends on the compounds being modelled and the modelling technique being used.

Original language: English (US)
Pages (from-to): 579-590
Number of pages: 12
Journal: SAR and QSAR in Environmental Research
Issue number: 8
State: Published - Aug 3 2018


Keywords:

  • dimension reduction
  • machine learning
  • molecular descriptors
  • quantitative structure–activity relationship (QSAR)
  • two-deep cross-validation
  • variable selection
