Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt

Abstract This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybri...

Bibliographic Details
Main Authors: Mohsen Shahhosseini, Guiping Hu, Isaiah Huber, Sotirios V. Archontoulis
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/33f06b500da54359bd032ea96d19cf9b
id oai:doaj.org-article:33f06b500da54359bd032ea96d19cf9b
record_format dspace
spelling oai:doaj.org-article:33f06b500da54359bd032ea96d19cf9b
2021-12-02T14:01:20Z
Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
10.1038/s41598-020-80820-1
2045-2322
https://doaj.org/article/33f06b500da54359bd032ea96d19cf9b
2021-01-01T00:00:00Z
https://doi.org/10.1038/s41598-020-80820-1
https://doaj.org/toc/2045-2322
Mohsen Shahhosseini
Guiping Hu
Isaiah Huber
Sotirios V. Archontoulis
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-15 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Mohsen Shahhosseini
Guiping Hu
Isaiah Huber
Sotirios V. Archontoulis
Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
description Abstract This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions, and determine which crop modeling features are most effective to integrate with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models were designed to address these questions. The results suggest that adding variables from a crop simulation model (APSIM) as input features to ML models can decrease yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil-moisture-related APSIM variables are the most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, feature importance measures show that simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient, and ML models need more hydrological inputs to make improved yield predictions.
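The abstract reports its headline result as a relative decrease in RMSE when APSIM variables are added to the ML inputs. The following is a minimal sketch of how such a comparison could be computed, not the authors' code. The function names `rmse` and `rmse_reduction_pct` and all yield values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted yields."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_pct(rmse_baseline, rmse_hybrid):
    """Relative RMSE decrease (in percent) of the hybrid model vs. the baseline."""
    return 100.0 * (rmse_baseline - rmse_hybrid) / rmse_baseline


if __name__ == "__main__":
    # Hypothetical observed yields (arbitrary units), not data from the study.
    observed = [10.0, 12.0, 11.0, 9.0]
    # Hypothetical predictions from an ML model without crop-model features.
    baseline_pred = [12.0, 10.0, 13.0, 7.0]
    # Hypothetical predictions from the same model with crop-model features added.
    hybrid_pred = [11.75, 10.25, 12.75, 7.25]

    base = rmse(observed, baseline_pred)
    hybrid = rmse(observed, hybrid_pred)
    print(f"baseline RMSE: {base}")
    print(f"hybrid RMSE:   {hybrid}")
    print(f"RMSE decrease: {rmse_reduction_pct(base, hybrid)}%")
```

With these illustrative numbers the baseline RMSE is 2.0 and the hybrid RMSE is 1.75, a decrease of 12.5%, which happens to fall inside the 7 to 20% range quoted in the abstract.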
format article
author Mohsen Shahhosseini
Guiping Hu
Isaiah Huber
Sotirios V. Archontoulis
author_facet Mohsen Shahhosseini
Guiping Hu
Isaiah Huber
Sotirios V. Archontoulis
author_sort Mohsen Shahhosseini
title Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
title_short Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
title_full Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
title_fullStr Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
title_full_unstemmed Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt
title_sort coupling machine learning and crop modeling improves crop yield prediction in the us corn belt
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/33f06b500da54359bd032ea96d19cf9b
work_keys_str_mv AT mohsenshahhosseini couplingmachinelearningandcropmodelingimprovescropyieldpredictionintheuscornbelt
AT guipinghu couplingmachinelearningandcropmodelingimprovescropyieldpredictionintheuscornbelt
AT isaiahhuber couplingmachinelearningandcropmodelingimprovescropyieldpredictionintheuscornbelt
AT sotiriosvarchontoulis couplingmachinelearningandcropmodelingimprovescropyieldpredictionintheuscornbelt
_version_ 1718392190485397504