Predicting crop yield is central to addressing emerging challenges in food security, particularly in an era of global climate change. Currently, machine learning and crop modeling are among the most commonly used approaches for yield prediction. This award supports fundamental research to combine the strengths of machine learning and crop models. Machine learning algorithms will be used to predict intermediate plant traits, which will then be fed into a crop model to predict grain yields across different environments and field management practices. Both the conception and the execution of this EAGER project depend on collaborations across multiple disciplines, including high-throughput phenotyping, object recognition, machine learning, optimization, computer simulation, and crop modeling. If successful, this research is expected to improve not only the accuracy but also the interpretability of yield prediction models, which will open numerous opportunities for downstream research and discoveries. The interdisciplinary effort will enhance the impact of science and engineering education across disciplines, while providing a collaborative and inclusive environment for all students to engage in cutting-edge research activities.
Underlying yield prediction is one of the grand challenges of biology: understanding how phenotype is determined by genotype, environment, and their interactions. Machine learning algorithms can predict crop phenotype with reasonable accuracy based on genotype information, but most models have a black-box structure and their results are hard to interpret. On the other hand, crop models offer biological insights into causes of phenotypic variation by providing explicit explanations of the interactions between traits and environmental conditions in different phases of the crop growth cycle, but the collection of trait measurement data and the calibration of model coefficients are labor-intensive, time-consuming, and costly. The proposed approach is a nested model. Deep learning algorithms will be trained to predict leaf appearance rate from genotype and empirically measured trait data. Training data will be extracted from images of plant leaves obtained via field experiments that employ a novel phenotyping technique. Next, the resulting predicted traits and environment data will be fed into the crop model to predict yield. If proven effective, this approach can be applied to study other plant traits to improve crop yield prediction.
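The nested structure described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the project's implementation: a one-feature least-squares fit stands in for the deep learning stage, and a simple thermal-time and light-interception loop stands in for the crop model. Every name, parameter value, and data point (`fit_linear`, `simulate_yield`, `genotype_score`, the base temperature, the radiation use efficiency, and so on) is hypothetical and chosen only to show how a predicted trait flows into a mechanistic yield simulation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def fit_linear(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Stage 1 stand-in: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


@dataclass
class DailyWeather:
    mean_temp_c: float   # daily mean air temperature (deg C)
    solar_mj_m2: float   # daily incoming solar radiation (MJ per m^2)


def simulate_yield(
    leaf_appearance_rate: float,     # leaves per degree-day (the predicted trait)
    weather: Sequence[DailyWeather],
    base_temp_c: float = 8.0,        # assumed base temperature for thermal time
    max_leaves: float = 20.0,        # assumed final leaf number
    leaf_area_m2: float = 0.05,      # assumed area per leaf
    plants_per_m2: float = 8.0,      # assumed planting density
    extinction: float = 0.6,         # assumed light extinction coefficient
    rue_g_per_mj: float = 3.0,       # assumed radiation use efficiency
    harvest_index: float = 0.5,      # assumed grain fraction of biomass
) -> float:
    """Stage 2 stand-in: toy crop model returning grain yield in g per m^2."""
    thermal_time = 0.0
    biomass = 0.0
    for day in weather:
        thermal_time += max(0.0, day.mean_temp_c - base_temp_c)
        # The predicted trait drives canopy development.
        leaves = min(max_leaves, leaf_appearance_rate * thermal_time)
        leaf_area_index = leaves * leaf_area_m2 * plants_per_m2
        # Beer-Lambert light interception; roughly half of solar radiation is PAR.
        intercepted = 0.5 * day.solar_mj_m2 * (1.0 - math.exp(-extinction * leaf_area_index))
        biomass += rue_g_per_mj * intercepted
    return harvest_index * biomass


def demo() -> float:
    # Hypothetical training data: a genotype score versus measured leaf appearance rate.
    genotype_score = [0.0, 1.0, 2.0, 3.0]
    measured_rate = [0.018, 0.021, 0.024, 0.027]
    intercept, slope = fit_linear(genotype_score, measured_rate)

    # Predict the trait for an unseen genotype, then pass it to the crop model.
    predicted_rate = intercept + slope * 1.5
    season: List[DailyWeather] = [DailyWeather(20.0, 18.0) for _ in range(100)]
    return simulate_yield(predicted_rate, season)


if __name__ == "__main__":
    print(f"Simulated yield (g/m^2): {demo():.1f}")
```

Keeping the trait as an explicit intermediate value is what makes the pipeline interpretable in this sketch: the predicted leaf appearance rate can be inspected or compared with field measurements before any yield is simulated, and the first stage can be replaced by a more capable model without changing the second.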
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.