Predicting aqueous solubility of environmentally relevant compounds from molecular features: A simple but highly effective four-dimensional model based on Project to Latent Structures

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

The aqueous solubility (log S) of xenobiotic chemicals has been identified as a key characteristic in determining their bioaccessibility/bioavailability and their fate and transport in aquatic environments. Here we explore and evaluate the use of a state-of-the-art data analysis technique (Project to Latent Structures, PLS) to estimate log S of environmentally relevant chemicals. A large number (n=624) of molecular descriptors was computed for over 1400 organic chemicals and then reduced by a feature selection technique. Candidate predictor descriptors were fitted to the data by means of PLS, which was optimized by internal leave-one-out cross-validation and validated against an external data set. The final (best) PLS model, with only four variables (AlogP, X1sol, Mv, and E), exhibited noteworthy stability and good predictive power. It was able to explain 91% of the variance in the data (n=1400) with an average absolute error of 0.5 log units, even though the solubilities span more than 12 orders of magnitude. The newly proposed model is transparent, easily portable from one user to another, and robust enough to accurately estimate log S of a wide range of emerging contaminants.
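The workflow summarized in the abstract (fit a PLS model on a few descriptors, choose the number of latent components by leave-one-out cross-validation, then check the model on an external set) can be illustrated with a minimal sketch. This is not the authors' code or data: the four descriptor columns are random numbers standing in for AlogP, X1sol, Mv, and E, the coefficients in `true_coef` are invented, the descriptors are not autoscaled, and the helper names (`fit_pls1`, `predict_pls1`, `loo_predictions`) are hypothetical. The fit is a plain single-response PLS (PLS1) with NIPALS-style deflation, written with the standard library only.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pls1(X, y, n_components):
    """Single-response PLS (PLS1) by successive deflation of X and y."""
    x_mean = [sum(col) / len(X) for col in zip(*X)]
    y_mean = sum(y) / len(y)
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centered X residual
    f = [v - y_mean for v in y]                              # centered y residual
    components = []
    for _ in range(n_components):
        cols = list(zip(*E))
        w = [dot(col, f) for col in cols]        # weight vector, proportional to E^T f
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:                         # no covariance left to model
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]           # scores
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in cols]   # X loadings
        q = dot(f, t) / tt                       # y loading
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def predict_pls1(model, x):
    """Predict one sample by repeating the training deflation steps."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

def loo_predictions(X, y, n_components):
    """Leave-one-out: refit without sample i, then predict sample i."""
    preds = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        preds.append(predict_pls1(model, X[i]))
    return preds

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

def mean_abs_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in data (assumption): 4 descriptors, invented coefficients.
rng = random.Random(0)
true_coef = [-1.0, 0.5, -0.3, 0.8]
X_all = [[rng.gauss(0, 1) for _ in true_coef] for _ in range(80)]
y_all = [dot(true_coef, row) + rng.gauss(0, 0.3) for row in X_all]
X_train, y_train = X_all[:60], y_all[:60]   # used for fitting and LOO
X_ext, y_ext = X_all[60:], y_all[60:]       # external validation set

# Choose the number of latent components by cross-validated R^2 (Q^2).
q2_scores = {a: r_squared(y_train, loo_predictions(X_train, y_train, a))
             for a in range(1, 5)}
best_a = max(q2_scores, key=q2_scores.get)

model = fit_pls1(X_train, y_train, best_a)
ext_pred = [predict_pls1(model, row) for row in X_ext]
external_r2 = r_squared(y_ext, ext_pred)
external_mae = mean_abs_error(y_ext, ext_pred)
print(f"components={best_a} Q2={q2_scores[best_a]:.3f} "
      f"external R2={external_r2:.3f} MAE={external_mae:.3f}")
```

With real descriptors, which sit on very different scales, each column would normally be scaled to unit variance before fitting; that step is omitted here only because the synthetic columns already share a common scale.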

Original language: English (US)
Pages (from-to): 5362-5370
Number of pages: 9
Journal: Water Research
Volume: 47
Issue number: 14
State: Published - Sep 5 2013

Bibliographical note

Funding Information:
This work was supported by the Postdoctoral Fellowship Program of the St. Anthony Falls Laboratory. The authors thank the anonymous reviewers for their constructive comments.

Keywords

  • Aqueous solubility
  • Environmental contaminants
  • Environmental mobility
  • Partial least-squares regression
  • QSPRs
  • Water quality
