Development of a two-step indirect method for modeling Ecom50

Lowell H. Hall, L. Mark Hall, Dennis W. Hill, Douglas M. Hawkins, Ming Hui Chen, David F. Grant

Research output: Contribution to journal › Article

Abstract

A novel approach is developed for modeling situations in which the modeled property is an algebraically transformed version of the original experimental data. In many cases such a transformation yields a data set with a significantly smaller range. Here we explore the effects of the range of data on modeling statistics. We illustrate a two-step method using data on the mass spectrometry collision energy (CE) required to decompose 50% of precursor ions to fragments (CE50). Earlier we showed that a nonlinear center-of-mass transformation, yielding Ecom50, produces values less dependent on the specific mass spectrometric experimental conditions. For this data set the Ecom50 range is 13.5% of the CE50 range. We propose a two-step modeling method. First, the original experimental data, CE50 (which has the larger range), are modeled by a standard method, partial least squares (PLS). Second, the calculated dependent variable from that model is algebraically transformed (not modeled) by the center-of-mass transformation, giving the generally more useful quantity, Ecom50. As shown here, using this two-step method to predict Ecom50 (from previously published data) gives a standard error 21% smaller than modeling Ecom50 directly and correspondingly narrows the confidence interval for prediction. Some specific implications for prediction are given for a published data set. This work is part of the ongoing development of a system of models to assist in the development of human metabolites.
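The two steps described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: ordinary least squares stands in for PLS to keep the example dependency-light, the center-of-mass transformation is assumed to take the common form E_com = CE × m_gas / (m_gas + m_ion), and the default collision gas mass (nitrogen, 28.0134 u) is a placeholder, since the abstract does not name the gas. All function names are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Step 1 (stand-in): least-squares fit with intercept.

    The paper uses PLS; plain least squares is swapped in here
    only to illustrate the workflow.
    """
    A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict CE50 from descriptors with the fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    return A @ coef


def to_ecom(ce, precursor_mass, gas_mass=28.0134):
    """Step 2: algebraic center-of-mass transformation (no modeling).

    Assumed form: E_com = CE * m_gas / (m_gas + m_ion).
    gas_mass defaults to nitrogen as a placeholder assumption.
    """
    ce = np.asarray(ce, dtype=float)
    precursor_mass = np.asarray(precursor_mass, dtype=float)
    return ce * gas_mass / (gas_mass + precursor_mass)


def two_step_ecom50(coef, X, precursor_mass, gas_mass=28.0134):
    """Model CE50 first, then transform the prediction to Ecom50."""
    ce50_pred = predict_linear(coef, X)
    return to_ecom(ce50_pred, precursor_mass, gas_mass)
```

The point of the design is that the regression is fitted on the wider-range quantity (CE50), and the narrower-range quantity (Ecom50) is obtained afterwards by arithmetic alone, using each compound's precursor ion mass.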

Original language: English (US)
Pages (from-to): 374-382
Number of pages: 9
Journal: Current computer-aided drug design
Volume: 10
Issue number: 4
DOI: https://doi.org/10.2174/1573409911666141231113516
Publication status: Published - Jan 1 2014

Keywords

  • Collision energy at 50% reduction (CE)
  • Molconn structure descriptors
  • PLS models
  • PubChem structures prediction
  • Range of data significance

Cite this

Hall, L. H., Hall, L. M., Hill, D. W., Hawkins, D. M., Chen, M. H., & Grant, D. F. (2014). Development of a two-step indirect method for modeling Ecom50. Current computer-aided drug design, 10(4), 374-382. https://doi.org/10.2174/1573409911666141231113516