Machine Learning for the Geosciences: Challenges and Opportunities

Anuj Karpatne, Imme Ebert-Uphoff, Sai Ravela, Hassan Ali Babaie, Vipin Kumar

Research output: Contribution to journal › Article › peer-review

302 Scopus citations


Geosciences is a field of great societal relevance that requires solutions to several urgent problems facing humanity and the planet. As geosciences enters the era of big data, machine learning (ML), which has been widely successful in commercial domains, offers immense potential to contribute to problems in geosciences. However, geoscience applications introduce novel challenges for ML due to combinations of geoscience properties encountered in every problem, requiring novel research in machine learning. This article introduces researchers in the ML community to the challenges posed by geoscience problems and the opportunities that exist for advancing both machine learning and geosciences. We first highlight typical sources of geoscience data and describe their common properties. We then describe some of the common categories of geoscience problems where machine learning can play a role, discussing the challenges faced by existing ML methods and opportunities for novel ML research. We conclude by discussing some of the cross-cutting research themes in machine learning that are applicable across several geoscience problems, and the importance of a deep collaboration between machine learning and geosciences for synergistic advancements in both disciplines.

Original language: English (US)
Article number: 8423072
Pages (from-to): 1544-1554
Number of pages: 11
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 8
State: Published - Aug 1 2019

Bibliographical note

Funding Information:
The authors of this paper are supported by inter-disciplinary projects at the interface of machine learning and geoscience, including the NSF Expeditions in Computing grant on “Understanding Climate Change: A Data-driven Approach” (Award #1029711), the NSF-funded 2015 IS-GEO workshop (Award #1533930), and subsequent Research Collaboration Network (EarthCube RCN IS-GEO: Intelligent Systems Research to Support Geosciences, Award #1632211). The vision outlined in this paper has been greatly influenced by the collaborative works in these projects. In particular, the description of geoscience properties in Section 3 has been motivated by the initial discussions at the 2015 IS-GEO workshop (see workshop report [105]). Dr. Ravela’s work was funded in part by an MIT Environmental Solutions Initiative seed fund award, the MIT MISTI program, and a Seaver Institute award.

Funding Information:
There are several communities working on the emerging field of ML for geosciences. These include, but are not limited to, Climate Informatics [7], Climate Change Expeditions [8], and ESSI [9]. More recently, NSF has funded a research coordination network on Intelligent Systems for Geosciences (IS-GEO) [10], with the intent of forging stronger connections between the ML and geoscience communities. On the educational side, the NSF is now funding three related NSF Research Trainee (NRT) programs, namely Data Science for Energy and Environmental Research at the University of Chicago; Environment and Society: Data Sciences for the 21st Century at UC Berkeley; and the Computational Geoscience Program at Stanford University.

Publisher Copyright:
© 1989-2012 IEEE.


  • Earth observation data
  • Earth science
  • Geoscience
  • Machine learning
  • Physics-based models

