TY - JOUR
T1 - Yield prediction through integration of genetic, environment, and management data through deep learning
AU - Kick, Daniel R.
AU - Wallace, Jason G.
AU - Schnable, James C.
AU - Kolkman, Judith M.
AU - Alaca, Barış
AU - Beissinger, Timothy M.
AU - Edwards, Jode
AU - Ertl, David
AU - Flint-Garcia, Sherry
AU - Gage, Joseph L.
AU - Hirsch, Candice N.
AU - Knoll, Joseph E.
AU - de Leon, Natalia
AU - Lima, Dayane C.
AU - Moreta, Danilo E.
AU - Singh, Maninder P.
AU - Thompson, Addie
AU - Weldekidan, Teclemariam
AU - Washburn, Jacob D.
N1 - Publisher Copyright: © 2023 Genetics Society of America. All rights reserved.
PY - 2023/4
Y1 - 2023/4
N2 - Accurate prediction of the phenotypic outcomes produced by different combinations of genotypes, environments, and management interventions remains a key goal in biology with direct applications to agriculture, research, and conservation. The past decades have seen an expansion of new methods applied toward this goal. Here we predict maize yield using deep neural networks, compare the efficacy of 2 model development methods, and contextualize model performance using conventional linear and machine learning models. We examine the usefulness of incorporating interactions between disparate data types. We find deep learning and best linear unbiased predictor (BLUP) models with interactions had the best overall performance. BLUP models achieved the lowest average error, but deep learning models performed more consistently with similar average error. Optimizing deep neural network submodules for each data type improved model performance relative to optimizing the whole model for all data types at once. Examining the effect of interactions in the best-performing model revealed that including interactions altered the model’s sensitivity to weather and management features, including a reduction of the importance scores for timepoints expected to have a limited physiological basis for influencing yield—those at the extreme end of the season, nearly 200 days post planting. Based on these results, deep learning provides a promising avenue for the phenotypic prediction of complex traits in complex environments and a potential mechanism to better understand the influence of environmental and genetic factors.
KW - GEM
KW - convolutional neural network
KW - deep learning
KW - gene-by-environment interaction (G×E)
KW - phenotypic prediction
UR - http://www.scopus.com/inward/record.url?scp=85158066353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85158066353&partnerID=8YFLogxK
U2 - 10.1093/g3journal/jkad006
DO - 10.1093/g3journal/jkad006
M3 - Article
C2 - 36625555
AN - SCOPUS:85158066353
SN - 2160-1836
VL - 13
JO - G3: Genes, Genomes, Genetics
JF - G3: Genes, Genomes, Genetics
IS - 4
M1 - jkad006
ER -