Title: Novel Statistical Methods in Analyzing Microbiome Data for Longitudinal Study
Abstract: Recent research demonstrates that changes in the microbiome can have considerable health implications and have been linked to conditions such as malnutrition, asthma, obesity, and diabetes. This has prompted substantial interest in the microbiome from both basic and clinical perspectives, and prospective longitudinal studies have been conducted to probe the mechanisms by which the microbiome affects health and disease. However, the special structure and characteristics of microbiome data complicate effective analysis. In particular: (1) microbiome data are compositional; (2) microbiome data are high dimensional; (3) bacterial taxa are related evolutionarily by a phylogenetic tree; and (4) microbiome compositions are often quantified as sparse compositional data vectors (sparse vectors of proportions with unit sum). The lack of appropriate statistical methods for these unique high-dimensional data hinders our ability to make inferences or draw conclusions about the role of the microbiome in human health and disease. Motivated by the challenges we have encountered during collaborative longitudinal microbiome studies on the effect of an altered microbiome on the development of type 1 diabetes, we propose to accomplish the following specific aims: (1) to design a framework to compare temporal changes of the microbiome between groups; (2) to develop longitudinal mediation models to infer causality in studies of microbe-induced complex traits and diseases such as diabetes; (3) to develop a unified, powerful, and robust statistical framework to test the association between bacterial taxa and diseases or traits in microbiome studies; and (4) to develop, distribute, and support freely available software packages implementing the methods we develop. The methods will be evaluated through analytical approaches, computer simulations, and applications to multiple real datasets.
The long-term goals of this application are to develop and implement novel statistical methods to study the dynamics of the microbiome composition, to identify key bacterial species that affect susceptibility to complex traits, to apply these methods to facilitate ongoing studies, and to disseminate these tools to the general research community.
We will develop and implement novel statistical methods to study temporal changes in microbiome composition between groups defined by treatment or a phenotype of interest, probe the causal relationships between disruption of the microbiome and human disease, and identify key bacterial taxa that affect susceptibility to complex traits. We will apply these methods to ongoing studies to promote biological discovery, and disseminate these tools to the general research community.