Core D: Data Analytics and Bioinformatics Core
Project Summary
The hypothesis-testing studies described in this proposal aim to understand the phenotypic, molecular and functional differences between antigen-specific B cell subsets found in human lymphoid and peripheral mucosal tissues. To achieve their goals, these studies will rely on several -omics and other data-rich approaches and their corresponding analyses. To support use of these platforms and their associated data analyses, we propose to provide unified and integrated data and informatics services in Core D, the Data Analytics and Bioinformatics Core. The Core will cover three broad areas, as reflected by the Specific Aims: data management, data processing pipelines, and collaborative downstream analysis.
In Specific Aim 1, we propose to implement and oversee systems and processes to manage donor demographic and sample processing metadata, the resulting data sets (raw through processed) and analysis provenance, with all the appropriate linkages. This will benefit the Program by creating infrastructure that can be used efficiently by all the Projects and Cores and by promoting good data stewardship practices, which in turn support reproducibility.
In Specific Aim 2, we propose to implement standardized workflows to cover all of the high-throughput platforms used by one or more Projects in this Program: RNA-seq, single-cell RNA-seq, B cell receptor sequencing (by Sanger sequencing as well as repertoire sequencing by next-generation sequencing), and multi-parameter flow cytometry using semi-automated approaches. This will benefit the Program by standardizing primary data processing across the Projects and promoting comparability of results.
In Specific Aim 3, we will provide collaborative downstream bioinformatics, analytical and statistical support for all three Projects. This will benefit the Program by serving as a resource that all Investigators in the Projects can draw on for data analyses that address their hypotheses. Centralizing this function will also make the support more economical, because the analytical needs and methodologies of the Projects will overlap. Importantly, this will also enable creation of a molecular atlas of memory B cells and plasma cells across tissues, integrating transcriptomic, phenotypic and B cell receptor repertoire data from the Projects. We will assemble this data set with an eye to future incorporation into a Data Commons environment. To develop this Core, we have assembled a strong team with experienced leadership and talented individuals with demonstrated expertise in all of the areas covered. Combining these three broad areas into a single Core will maximize efficiency, standardize processes and promote scientific synergy across the Program.