Predicting nicotine metabolism across ancestries using genotypes

James W. Baurley, Andrew W. Bergen, Carolyn M. Ervin, Sung shim Lani Park, Sharon E. Murphy, Christopher S. McMahan

Research output: Contribution to journal › Article › peer-review

Abstract

Background: There is a need to match characteristics of tobacco users with cessation treatments and risks of tobacco-attributable diseases such as lung cancer. The rate at which the body metabolizes nicotine has proven to be an important predictor of these outcomes. Nicotine metabolism is primarily catalyzed by the enzyme cytochrome P450 2A6 (CYP2A6), and CYP2A6 activity can be measured as the nicotine metabolite ratio (NMR), the ratio of two nicotine metabolites: trans-3’-hydroxycotinine to cotinine. Measurements of these metabolites are only possible in current tobacco users and vary by biofluid source, timing of collection, and protocols; unfortunately, this has limited their use in clinical practice. The NMR depends highly on genetic variation near CYP2A6 on chromosome 19 as well as ancestry, environmental, and other genetic factors. Thus, we aimed to develop prediction models of nicotine metabolism using genotypes and basic individual characteristics (age, gender, height, and weight).

Results: We identified four multiethnic studies with nicotine metabolites and DNA samples. We constructed a 263-marker panel by filtering genome-wide association scans of the NMR in each study. We then applied seven machine learning techniques to train models of nicotine metabolism on the largest and most ancestrally diverse dataset (N=2239). The models were then validated using the other three studies (total N=1415). Using cross-validation, we found that the correlations between the observed and predicted NMR ranged from 0.69 to 0.97 depending on the model. When predictions were averaged in an ensemble model, the correlation was 0.81. The ensemble model generalizes well in the validation studies across ancestries, despite differences in the measurements of NMR between studies, with correlations of 0.52 for African ancestry, 0.61 for Asian ancestry, and 0.46 for European ancestry. The most influential predictors of NMR identified in more than two models were rs56113850, rs11878604, and 21 other genetic variants near CYP2A6, as well as age and ancestry.

Conclusions: We have developed an ensemble of seven models for predicting the NMR across ancestries from genotypes, age, gender, and BMI. These models were validated using three datasets and are associated with nicotine dosages. Knowledge of how an individual metabolizes nicotine could be used to help select the optimal path to reducing or quitting tobacco use, as well as to evaluate the risks of tobacco use.
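The two calculations the abstract relies on, averaging per-model predictions into an ensemble and correlating the ensemble with the observed NMR, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the seven techniques, so the two "models" below are just hypothetical lists of predictions, the function names are invented for illustration, and the numbers are toy values rather than study data. Pearson correlation is assumed here; the abstract does not state which correlation measure was used.

```python
from math import sqrt


def nmr(hydroxycotinine, cotinine):
    """Nicotine metabolite ratio: trans-3'-hydroxycotinine divided by cotinine."""
    if cotinine <= 0:
        raise ValueError("cotinine must be positive")
    return hydroxycotinine / cotinine


def ensemble_predict(per_model_predictions):
    """Average predictions across models (one equal-length list per model)."""
    n_models = len(per_model_predictions)
    return [sum(values) / n_models for values in zip(*per_model_predictions)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy values only (hypothetical, not taken from the study).
observed = [0.2, 0.4, 0.6, 0.8]
model_a = [0.1, 0.5, 0.5, 0.9]  # predictions from a hypothetical first model
model_b = [0.3, 0.3, 0.7, 0.7]  # predictions from a hypothetical second model

ensemble = ensemble_predict([model_a, model_b])
r = pearson(observed, ensemble)
```

In this toy case the errors of the two hypothetical models cancel, so the averaged prediction tracks the observed values more closely than either model alone. That is the general motivation for ensembling, not a claim about the reported results.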

Original language: English (US)
Article number: 663
Journal: BMC Genomics
Volume: 23
Issue number: 1
State: Published - Dec 2022

Bibliographical note

Funding Information:
The authors thank the participants, staff, and investigators of the MEC, the CENIC, the HSS, and the METS. The METS were supported by the National Institute on Drug Abuse: Pharmacokinetics and Pharmacodynamics of Nicotine (DA002277, PI: Neal L Benowitz), Young Adult Substance Use-Predictors and Consequences (DA003706, PI: Hy Hops), Pharmacokinetics of Nicotine in Twins (DA011170, PI: Gary E Swan), Pharmacogenetics of Nicotine Addiction Treatment Consortium (DA020830, PI: Neal L Benowitz; MPI: Rachel F Tyndale and Caryn Lerman); and by the Tobacco-Related Disease Research Program of the University of California: Nicotine Metabolism in Families (7PT-2004, PI: Neal L Benowitz). The MEC was supported by the National Cancer Institute (U01 CA164973 and P01 CA138338). CENIC was supported by a grant from the National Institute on Drug Abuse and the Food and Drug Administration Center for Tobacco Products (U54 DA031659). The HSS was supported by the National Cancer Institute (R01 CA 85997). We would like to acknowledge the BioRealm LLC team (https://biorealm.ai) for supporting project workflows and computation and IBX (http://ibx.bio) for sample processing.

Funding Information:
This study was funded by the National Institute on Alcohol Abuse and Alcoholism (R44 AA027675) and the National Institute on Drug Abuse (R43 DA041211). The sponsors had no role in the analysis of data, writing of the report, or in the decision to submit the paper for publication.

Publisher Copyright:
© 2022, The Author(s).

Keywords

  • Machine learning
  • Nicotine biomarkers
  • Nicotine metabolism
  • Polygenic risk score
  • Smoking cessation
  • Statistical learning
