Random forests for global and regional crop yield predictions

Jig Han Jeong, Jonathan P. Resop, Nathaniel D. Mueller, David H. Fleisher, Kyungdahm Yun, Ethan E. Butler, Dennis J. Timlin, Kyo Moon Shim, James S. Gerber, Vangimalla R. Reddy, Soo Hyung Kim

Research output: Contribution to journal › Article › peer-review

278 Scopus citations


Accurate predictions of crop yield are critical for developing effective agricultural and food policies at the regional and global scales. We evaluated a machine-learning method, Random Forests (RF), for its ability to predict crop yield responses to climate and biophysical variables at global and regional scales in wheat, maize, and potato, with multiple linear regression (MLR) serving as a benchmark. We used crop yield data from various sources and regions for model training and testing: 1) gridded global wheat grain yield, 2) maize grain yield from US counties over thirty years, and 3) potato tuber and maize silage yield from the northeastern seaboard region. RF was found to be highly capable of predicting crop yields and outperformed the MLR benchmarks in all performance statistics that were compared. For example, the root mean square error (RMSE) ranged from 6% to 14% of the average observed yield for RF models in all test cases, whereas it ranged from 14% to 49% for MLR models. Our results show that RF is an effective and versatile machine-learning method for crop yield predictions at regional and global scales, owing to its high accuracy and precision, ease of use, and utility in data analysis. However, RF may lose accuracy when predicting responses at the extreme ends of, or beyond the boundaries of, the training data.
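The abstract reports RMSE as a percentage of the average observed yield. As a minimal sketch of how such a figure can be computed, assuming the usual definition of RMSE divided by the mean of the observations, the Python snippet below may help. The function name `rmse_percent_of_mean` and the yield values are hypothetical and are not taken from the paper.

```python
import math


def rmse_percent_of_mean(observed, predicted):
    """Return RMSE as a percentage of the mean observed value."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    # Mean squared error over all paired observations
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical yields (t/ha), made up for illustration only
observed = [8.0, 10.0, 12.0]
predicted = [9.0, 10.0, 11.0]
print(round(rmse_percent_of_mean(observed, predicted), 2))
```

Expressing the error relative to the mean observed yield makes values comparable across crops and regions whose absolute yields differ, which is presumably why the abstract reports it this way.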

Original language: English (US)
Article number: e0156571
Journal: PLoS ONE
Issue number: 6
State: Published - Jun 1 2016

Bibliographical note

Funding Information:
We thank Elliot Blasich and Marian Hsieh for their assistance with this project. This study was supported by a Cooperative Research Program for Agricultural Science and Technology Development (Project No. PJ01000707), Rural Development Administration, Republic of Korea (SHK; KMS). Additional support was provided in part by a Specific Cooperative Agreement: 58-1265-1-074 between University of Washington and USDA-ARS (SHK; VRR), the USDA-ARS Headquarters Postdoctoral Research Associate Program (DHF), the USDA-NIFA-AFRI Grant no. 2011-68004-30057: Enhancing Food Security of Underserved Populations in the Northeast through Sustainable Regional Food Systems (DHF), the USDA AFRI fellowship 2016-67012-25208 (NDM), the NSF Hydrological Sciences grant 1521210 (NDM), and the Packard Foundation (EEB).

Publisher Copyright:
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

