The dimensionality of a relatively large data set (95 compounds) tested for toxicity (mutagenicity) was explored in order to build QSAR models. Distinct molecular descriptors were used. Data dimensionality was evaluated using PCA, correlation plots, and clustering. Analyzing data dimensionality allowed the models to be optimized. Docking studies and PCA were used to expand data dimensionality. The squared Pearson correlation coefficient (r²) values obtained for both perceptive and predictive models were satisfactory.
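As a rough illustration of the correlation-based part of such an analysis, the sketch below computes pairwise Pearson correlations between descriptor columns to flag near-redundant descriptors, and the squared correlation (r²) between observed and predicted values. The function names, the 0.9 threshold, and all numbers are hypothetical and chosen for illustration only. They are not taken from the study, which does not describe its implementation here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(descriptors, threshold=0.9):
    """Return descriptor-name pairs whose |r| meets or exceeds the threshold.

    The threshold is an assumed cut-off, not a value from the study.
    """
    names = list(descriptors)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson_r(descriptors[a], descriptors[b])) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical toy values for illustration only (not data from the study).
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to d1
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],  # only weakly related to d1
}
observed = [0.5, 1.1, 1.4, 2.1, 2.4]
predicted = [0.6, 1.0, 1.5, 2.0, 2.5]

# Descriptor pairs that carry nearly the same information.
print(redundant_pairs(descriptors))

# Squared Pearson correlation between observed and predicted values.
print(round(pearson_r(observed, predicted) ** 2, 3))
```

Descriptors that appear in a flagged pair add little independent information, so the effective dimensionality of the descriptor set is lower than the raw column count suggests.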
Bibliographical note
Funding Information:
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CCCDI – UEFISCDI, project number 8/2015, acronym GEMNS (under the frame of the ERA-NET EuroNanoMed II European Innovative Research and Technological Development Projects in Nanomedicine).
- Ames test
- Data dimensionality
- Principal component analysis (PCA)
- Topological descriptor