The specific objective of the Data Management, Analysis, and Resources Dissemination Core is to provide the systems and bioinformatics computing infrastructure necessary for tracking, storing, integrating, and disseminating the data and biological resources generated in Research Projects 1-4 and in the Technology and Immunology Cores (Cores B and D, respectively). Data and resources from the Projects will be made available to the wider scientific community through submissions to public archives, where applicable, or through a project portal when such archives do not exist. Using our customized Laboratory Information Management System (LIMS), project tracking systems, and integrated web tools, we will track all samples, data, IRB documentation, and their respective metadata throughout each of the Research Projects. We expect the Research Projects and Cores to generate significant amounts of data, and we will continue to scale our high-capacity data storage infrastructure to store, manage, and share these data with the scientific community. We will deploy the Open Science Data Framework (OSDF), a scalable file system platform developed at IGS for the storage of different data types along with their metadata. Finally, we will work with the Research Project PIs to ensure that all primary data and derived analysis results are made available to the wider scientific community. This will be accomplished by depositing the primary data in public archives such as the NCBI Sequence Read Archive (SRA), and the derived results, such as genome assemblies, sequence variants, and expression data, in NCBI GenBank, dbSNP, and the Gene Expression Omnibus (GEO). In addition to depositing data in public archives, we will build a website through which the scientific community can access all raw and custom data sets, analysis tools, and standard operating procedure (SOP) documents.
The proposed research in this U19 award will use state-of-the-art genomics approaches to study a variety of high-priority pathogens to better understand how they cause disease in infected individuals. This Data Management, Analysis, and Resources Dissemination Core will ensure that data and resources from the Projects are made available to the wider scientific community through submissions to public archives.
Tallon, Luke J; Liu, Xinyue; Bennuru, Sasisekhar et al. (2014) Single molecule sequencing and genome assembly of a clinical specimen of Loa loa, the causative agent of loiasis. BMC Genomics 15:788
Crabtree, Jonathan; Agrawal, Sonia; Mahurkar, Anup et al. (2014) Circleator: flexible circular visualization of genome-associated data with BioPerl and SVG. Bioinformatics 30:3125-7