TY - JOUR
T1 - A Big Data Guide to Understanding Climate Change
T2 - The Case for Theory-Guided Data Science
AU - Faghmous, James H.
AU - Kumar, Vipin
N1 - Publisher Copyright: © Copyright 2014, Mary Ann Liebert, Inc.
PY - 2014/9
Y1 - 2014/9
N2 - Global climate change and its impact on human life have become one of our era's greatest challenges. Despite the urgency and the abundance of climate data, data science has had little impact on furthering our understanding of our planet. This stands in stark contrast to other fields such as advertising or electronic commerce, where big data has been a great success story. This discrepancy stems from the complex nature of climate data as well as the scientific questions climate science brings forth. This article introduces a data science audience to the challenges and opportunities of mining large climate datasets, with an emphasis on the nuanced difference between mining climate data and traditional big data approaches. We focus on data, methods, and application challenges that must be addressed for big data to fulfill its promise with regard to climate science applications. More importantly, we highlight research showing that relying solely on traditional big data techniques results in dubious findings, and we instead propose a theory-guided data science paradigm that uses scientific theory to constrain both the big data techniques and the results-interpretation process to extract accurate insight from large climate data.
AB - Global climate change and its impact on human life have become one of our era's greatest challenges. Despite the urgency and the abundance of climate data, data science has had little impact on furthering our understanding of our planet. This stands in stark contrast to other fields such as advertising or electronic commerce, where big data has been a great success story. This discrepancy stems from the complex nature of climate data as well as the scientific questions climate science brings forth. This article introduces a data science audience to the challenges and opportunities of mining large climate datasets, with an emphasis on the nuanced difference between mining climate data and traditional big data approaches. We focus on data, methods, and application challenges that must be addressed for big data to fulfill its promise with regard to climate science applications. More importantly, we highlight research showing that relying solely on traditional big data techniques results in dubious findings, and we instead propose a theory-guided data science paradigm that uses scientific theory to constrain both the big data techniques and the results-interpretation process to extract accurate insight from large climate data.
UR - http://www.scopus.com/inward/record.url?scp=84991818059&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991818059&partnerID=8YFLogxK
U2 - 10.1089/big.2014.0026
DO - 10.1089/big.2014.0026
M3 - Article
AN - SCOPUS:84991818059
SN - 2167-6461
VL - 2
SP - 155
EP - 163
JO - Big Data
JF - Big Data
IS - 3
ER -