TY - JOUR
T1 - Assessment of groundwater well vulnerability to contamination through physics-informed machine learning
AU - Soriano, Mario A.
AU - Siegel, Helen G.
AU - Johnson, Nicholaus P.
AU - Gutchess, Kristina M.
AU - Xiong, Boya
AU - Li, Yunpo
AU - Clark, Cassandra J.
AU - Plata, Desiree L.
AU - Deziel, Nicole C.
AU - Saiers, James E.
N1 - Publisher Copyright: © 2021 The Author(s). Published by IOP Publishing Ltd
PY - 2021/7/22
Y1 - 2021/7/22
N2 - Contamination from anthropogenic activities is a long-standing challenge to the sustainability of groundwater resources. Physically based (PB) models are often used in groundwater risk assessments, but their application to large scale problems requiring high spatial resolution remains computationally intractable. Machine learning (ML) models have emerged as an alternative to PB models in the era of big data, but the necessary number of observations may be impractical to obtain when events are rare, such as episodic groundwater contamination incidents. The current study employs metamodeling, a hybrid approach that combines the strengths of PB and ML models while addressing their respective limitations, to evaluate groundwater well vulnerability to contamination from unconventional oil and gas development (UD). We illustrate the approach in northeastern Pennsylvania, where intensive natural gas production from the Marcellus Shale overlaps with local community dependence on shallow aquifers. Metamodels were trained to classify vulnerability from predictors readily computable in a geographic information system. The trained metamodels exhibited high accuracy (average out-of-bag classification error <5%). A predictor combining information on topography, hydrology, and proximity to contaminant sources (inverse distance to nearest upgradient UD source) was found to be highly important for accurate metamodel predictions. Alongside violation reports and historical groundwater quality records, the predicted vulnerability provided critical insights for establishing the prevalence of UD contamination in 94 household wells that we sampled in 2018. While <10% of the sampled wells exhibited chemical signatures consistent with UD produced wastewaters, >60% were predicted to be in vulnerable locations, suggesting that future impacts are likely to occur with greater frequency if safeguards against contaminant releases are relaxed. Our results show that hybrid physics-informed ML offers a robust and scalable framework for assessing groundwater contamination risks.
KW - Drinking water quality
KW - Groundwater contamination risk assessment
KW - Metamodeling
KW - Physics-informed machine learning
KW - Unconventional oil and gas development
UR - http://www.scopus.com/inward/record.url?scp=85112105546&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112105546&partnerID=8YFLogxK
U2 - 10.1088/1748-9326/ac10e0
DO - 10.1088/1748-9326/ac10e0
M3 - Article
AN - SCOPUS:85112105546
SN - 1748-9318
VL - 16
JO - Environmental Research Letters
JF - Environmental Research Letters
IS - 8
M1 - 084013
ER -