The human microbiome, or the collection of microorganisms that inhabits the human body, plays an important role in wide-ranging aspects of human health and disease, including conditions such as inflammatory bowel disease, diabetes, psoriasis, childhood asthma, and chronic obstructive pulmonary disease. There has been growing interest in studying the microbiome across time either to assess its stability or to characterize changes associated with disease onset, progression, or treatment. However, few statistical methods exist for analysis of microbiome data from longitudinal studies (i.e., studies that track individuals across time). Therefore, there is a growing need for rigorous statistical methods to analyze longitudinal microbiome data. The proposed research will address this need in the context of two types of questions: whether changes in microbiome composition as a whole are associated with an outcome, and how we can use multiple measurements on an individual to discover which taxa are associated with the outcome. To address the first question, a community-level test of association between the entire microbiome and an outcome will be developed within the nonparametric kernel machine regression framework. The kernel, which summarizes the relationship between microbiome and outcome, will be based on a novel measure of distance that incorporates phylogenetic information and summarizes the extent of change within and between individuals. For the second question, variable selection techniques will be used with generalized estimating equations to account for correlation between repeated measurements on the same set of individuals. This will result in sparse models that include only the taxa that are relevant to the outcome. Because information may be gained by grouping taxa based on phylogenetic or functional relationships, in addition to selecting individual taxa, this method will allow selection among pre-defined groups of taxa.
In combination, these two methods will enable powerful and robust inference of relationships between the microbiome and health outcomes in longitudinal studies. To demonstrate the utility of these methods, we will apply them to a longitudinal study of graft-versus-host disease. Because they use all of the data collected across time and incorporate phylogenetic and/or functional relationships between taxa, these methods will allow detection of novel relationships between graft-versus-host disease and microbial communities or specific taxa.
The human microbiome, or the collection of microorganisms that inhabits the human body, has been implicated in a vast range of diseases, and studying the microbiome across time could provide new insights into disease associations. Since statistical methods for longitudinal (time series) microbiome studies are lacking, we will develop new methods to analyze the interrelationship between changing microbial communities and changes in disease state or severity. We will apply these advances to clarify the role of the microbiome in graft-versus-host disease and identify possible therapeutic strategies.