### Abstract

A number of multivariate regression methods commonly used to develop predictive models, along with the associated model validation techniques, run contrary to current expert opinion in the field of statistics. Such methods result in overly optimistic models that cannot be relied upon to produce meaningful predictions for new compounds. Ridge regression is one appropriate methodology when the number of independent variables exceeds the number of observations. Although variable reduction is not a necessary component of a ridge regression analysis, descriptor thinning may be applied to eliminate variables that have no relationship to the property or activity of interest in an effort to increase model interpretability. It is critical, however, that this process be carried out correctly. In this paper, we have developed a predictive model for the rat fat:air partition coefficient using proper statistical techniques. For comparison, we have also used stepwise ordinary least squares regression, which is commonly used in QSAR studies but often results in an inflated "naïve" q². It is important to note that all descriptors used in this analysis are computed strictly from chemical structure without the need for any additional experimental input and, therefore, can be applied to any chemical, real or hypothetical, in order to assess its pharmacokinetics and toxic potential.
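The contrast between a "naïve" q² and a properly validated one can be illustrated with a minimal sketch. This is not the authors' procedure: the data are synthetic noise, the thinning rule (keep the `k` descriptors most correlated with the response) is a hypothetical stand-in for whatever thinning method is used in the paper, and the names `ridge_fit`, `top_k_by_correlation`, and `loo_q2` are introduced here purely for illustration. The only point shown is where the thinning step sits relative to the leave-one-out loop.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge fit on centered data; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Dual form: solve an n x n system, convenient when descriptors outnumber observations.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), yc)
    coef = Xc.T @ alpha
    return coef, y_mean - x_mean @ coef

def top_k_by_correlation(X, y, k):
    """Hypothetical thinning rule: indices of the k descriptors with largest |correlation| to y."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)
    return np.argsort(-np.abs(corr))[:k]

def loo_q2(X, y, lam, k, thin_inside):
    """Leave-one-out q2 = 1 - PRESS / total sum of squares.

    thin_inside=False: descriptors are chosen once using all compounds (naive).
    thin_inside=True:  descriptors are re-chosen within each training fold.
    """
    n = len(y)
    preds = np.empty(n)
    global_keep = top_k_by_correlation(X, y, k)
    for i in range(n):
        train = np.arange(n) != i
        keep = top_k_by_correlation(X[train], y[train], k) if thin_inside else global_keep
        coef, intercept = ridge_fit(X[train][:, keep], y[train], lam)
        preds[i] = X[i, keep] @ coef + intercept
    press = np.sum((y - preds) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic example (assumption): 30 "compounds", 500 random "descriptors",
# and a response that is pure noise, so no real predictive signal exists.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 500))
y = rng.standard_normal(30)

naive_q2 = loo_q2(X, y, lam=1.0, k=10, thin_inside=False)
proper_q2 = loo_q2(X, y, lam=1.0, k=10, thin_inside=True)
print(f"naive q2:  {naive_q2:.3f}")
print(f"proper q2: {proper_q2:.3f}")
```

Because the response here carries no signal, any apparent predictive ability in the naive value comes from letting the held-out compound influence descriptor selection. Repeating the thinning inside each fold removes that leak, which is one reading of what carrying out the process "correctly" requires.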

Original language | English (US) |
---|---|
Title of host publication | Computation in Modern Science and Engineering - Proceedings of the International Conference on Computational Methods in Science and Engineering 2007 (ICCMSE 2007) |
Pages | 548-551 |
Number of pages | 4 |
Edition | 2 |
DOIs | https://doi.org/10.1063/1.2836137 |
Publication status | Published - Dec 1 2007 |
Event | International Conference on Computational Methods in Science and Engineering 2007, ICCMSE 2007 - Corfu, Greece. Duration: Sep 25 2007 → Sep 30 2007 |

### Publication series

Name | AIP Conference Proceedings |
---|---|
Number | 2 |
Volume | 963 |
ISSN (Print) | 0094-243X |
ISSN (Electronic) | 1551-7616 |

### Other

Other | International Conference on Computational Methods in Science and Engineering 2007, ICCMSE 2007 |
---|---|
Country | Greece |
City | Corfu |
Period | 9/25/07 → 9/30/07 |

### Keywords

- Descriptor thinning
- Gram-Schmidt
- Mathematical descriptors
- Overfitting
- Ridge regression
- Stepwise regression

### Cite this

*Computation in Modern Science and Engineering - Proceedings of the International Conference on Computational Methods in Science and Engineering 2007 (ICCMSE 2007)* (2 ed., pp. 548-551). (AIP Conference Proceedings; Vol. 963, No. 2). https://doi.org/10.1063/1.2836137