Evolution and natural selection have produced an extraordinary diversity in shape and form among microbes, plants, insects, animals, and humans. This project develops new statistical methods for understanding how genetic and environmental factors combine to influence plant and animal morphology. Leaf morphology is a key plant trait that is sensitive to environmental stimuli and is important for plant growth. Studying leaf shape leads to a better understanding of ecology, evolution, and atmosphere-biosphere interactions and their implications (e.g., for climate change). This project focuses on the species Arabidopsis thaliana, a small flowering Eurasian plant that is commonly used in plant genetics studies because of the nature of its genome and the availability of its genetic data. This project promotes interdisciplinary cooperation and training among investigators and students in statistics, image analysis, genetics, biology, and computer science. The tools produced by this project are applicable to broader data-intensive and analytic research in paleoclimatology, anthropology, agriculture, developmental biology, evolution, ecology, and biomedicine.
In this project, Arabidopsis thaliana plants are cultivated under controlled conditions in a growth chamber, and photographs document the shapes of mature leaves. High-dimensional curves are then used to create an accurate representation of each individual leaf shape. With each shape uniquely described, a two-stage statistical model, integrating Bayesian Lasso and Functional Data Analysis, is used to detect significant genes that regulate shape. First, a subset of genetic markers, single nucleotide polymorphisms (SNPs), that significantly affect shape and respond to environmental factors is selected. Then the high-dimensional curve modeling is used to gain knowledge of the detailed genetic and biological functions of the candidate SNP markers. Combining high-dimensional data analysis that assesses leaf shape with the examination of thousands of SNP markers requires computationally intensive methods, both with respect to algorithms and data management. The tools developed will be made available in user-friendly R and Matlab packages designed for high-performance computing environments, for broad use by researchers with similar interests in shape analysis.
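As a rough illustration of the pipeline described above, the following Python sketch resamples a closed leaf outline at equal arc-length steps, derives a simple shape score from it, and screens markers with a sparse regression. It is a minimal sketch under stated assumptions, not the project's method: a plain (non-Bayesian) lasso fitted by coordinate descent stands in for the Bayesian Lasso, the height-to-width ratio stands in for a full functional representation of the curve, and the simulated 0/1 genotypes, elliptical outlines, function names, and penalty value are all hypothetical.

```python
import numpy as np

def resample_outline(points, n=64):
    """Resample a closed 2-D outline to n points equally spaced in arc length."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])                 # repeat first point to close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])        # cumulative arc length
    targets = np.linspace(0.0, s[-1], n, endpoint=False)
    x = np.interp(targets, s, closed[:, 0])
    y = np.interp(targets, s, closed[:, 1])
    return np.column_stack([x, y])

def lasso_cd(X, y, lam, n_iter=200):
    """Plain lasso by coordinate descent: min (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = np.array(y, dtype=float)                       # residual y - Xb (b starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]                        # remove marker j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft threshold
            r -= X[:, j] * b[j]                        # add marker j back
    return b

# Simulated data (hypothetical): 200 plants, 20 markers coded 0/1,
# with marker 3 widening an elliptical "leaf" outline.
rng = np.random.default_rng(0)
n_plants, n_snps = 200, 20
G = rng.integers(0, 2, size=(n_plants, n_snps)).astype(float)
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)

trait = np.empty(n_plants)
for i in range(n_plants):
    half_height = 0.5 + 0.2 * G[i, 3] + rng.normal(0.0, 0.02)
    outline = np.column_stack([np.cos(t), half_height * np.sin(t)])
    curve = resample_outline(outline, n=64)
    extent = curve.max(axis=0) - curve.min(axis=0)
    trait[i] = extent[1] / extent[0]                   # height-to-width ratio

Xc = G - G.mean(axis=0)                                # center markers and trait
yc = trait - trait.mean()
beta = lasso_cd(Xc, yc, lam=0.01)
selected = np.flatnonzero(beta)                        # markers kept by the penalty
```

In this toy setting the penalty shrinks most marker effects to exactly zero, and the entries of `selected` would be the candidates passed on to a more detailed curve-level analysis in the second stage.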