Researchers from diverse fields are expanding their work into microbiome studies to understand the interactions between microbes, hosts, and the environment. As new technologies for accelerated production of microbiome sequence data have enabled this type of research, there is a pressing need for high-performance computing resources that accommodate flexible and consistent configuration, deployment, and execution of constantly improving analytical pipelines and that enable data harmonization and interoperability. Standards for data and experimental representation are still being developed and enhanced, resulting in semantic inconsistency and incompatible data formats and conventions, which present data integration and management challenges. Meta-analyses of pooled data are becoming more widespread as computational power increases. Assessment of the sources of variation in microbiota profiling is sorely needed to understand how to combine and integrate data from different studies. As the field rapidly evolves and new sequencing and processing techniques are developed, the use of hard-coded scientific pipelines limits the scope of biological interpretations. We propose to develop a "Microbiome Meta-Analysis Platform" (MIMAP) that takes advantage of cluster computing, software containerization, and semantic data integration technologies to enable building, modifying, and evaluating alternative bioinformatics pipelines for reproducibility studies, new studies, and meta-analysis of microbiome data from different cohorts, from cross-sectional and longitudinal studies, and from public sources, collaborators, and in-house studies.
It enables deployment and testing of existing and emerging bacterial identification and downstream analysis algorithms, substitution of tools to test new approaches, and semantic modeling of data for pooling multiple studies and integrating clinical information, all through a user-friendly interface designed with the guidance of an expert team of microbiome specialists. It also allows researchers to perform quality control evaluations using positive and negative controls and provenance data. In Phase I, the main workflow execution, data modeling, and evaluation strategies will be prototyped to demonstrate feasibility. During Phase II development, the complete MIMAP system will be created as a solution for the execution of microbiome research.
New technologies have enabled scientists from all fields and disciplines to explore the microbial world and its influence on human health and disease. MIMAP provides a scalable computing solution that enables the configuration, deployment, and reproduction of microbiome data analysis and meta-analysis through semantic data integration and distributed scientific workflow execution.