Maize, sorghum, sugarcane, and Miscanthus are the most productive and water-efficient crops and biofuel feedstocks in the world. This productivity stems from a physiology and genetic ancestry shared over the last 15 million years. While these four crops will be extensively used, they are closely related to another 800 species that dominate grasslands across the world and are adapted to numerous environmental stresses, including flooding, drought, heat, and frost. The project team will use modern genomics and machine learning to survey and analyze these related species, determining the most important shared genetic features that allow them to adapt to heat and drought. Commercial and public sector plant breeders will use the results of this work to make maize and sorghum more productive and more resilient to extreme weather. Key to this long-term impact is training the next generation of scientists in computational biology to address fundamental questions. These skills will be developed through hackathons and bioinformatics training workshops. The project will communicate this science to the general public through venues such as a traveling museum exhibit.
The Andropogoneae tribe of grasses contains a thousand species that collectively represent over a billion years of evolutionary history. The tribe has used NADP-C4 photosynthesis and a wide range of adaptations to become a dominant clade on Earth. This project will use the diversity and evolution across this tribe to understand the rules of adaptive convergence and constraint in plant genomes. The project team will sample and analyze the worldwide spectrum of genetic diversity in Andropogoneae to develop detailed models testing whether (1) quantitative estimates of evolutionary constraint improve predictions of fitness-related traits, and (2) convergent environmental adaptations shared across the Andropogoneae explain a substantial proportion of total adaptive variance. These hypotheses will be tested by assembling the gene and regulatory content of 57 species and by whole-genome sequencing of another 700 species. For eight species, diversity across their natural range of adaptation will be surveyed at the sequence level. Evolutionary and machine learning models will be used to quantify the disruptive impact of a mutation in every ancestral genomic element. The inter- and intraspecific surveys will also permit an estimate of the prevalence of convergent evolution. This project addresses two key elements of the genotype-to-phenotype problem: how to quantify the disruptive impact of mutations, and how to determine whether adaptive solutions to environmental stresses are convergently shared across species.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.