(Data Management and Analysis Core: Aikseng Ooi and Nirav Merchant) The University of Arizona Superfund Research Program (UA SRP) will generate volumes and types of data that are not manageable in traditional laboratory settings. The Data Management and Analysis Core (DMAC) will function as the primary service for UA SRP investigations into large biological, geophysical, and chemical datasets, including but not limited to RNA sequencing, chromatin immunoprecipitation sequencing, exome sequencing, metabolomics, metagenomics, microbiome amplicon sequencing, geospatial positioning, analytical chemistry, and imaging. DMAC will support investigators through three core functions: (1) DMAC will lead the storage of all data in CyVerse, an easy-to-access data repository system. CyVerse is a computational infrastructure, maintained at the University of Arizona, whose hardware, software, and personnel are designed to handle very large datasets and complex analyses. DMAC will utilize a reference implementation (RI) that divides data into five levels for easy data sharing, processing, and analysis. The lowest level (level 1) will hold raw data, while the highest level (level 5) will hold file formats usable in graphical visualizations. DMAC will support these processes with help from on-staff statisticians and bioinformaticians who can devise analysis strategies for individual investigators. In addition to data storage, DMAC will orchestrate sample management using Fulcrum software. Fulcrum allows barcoding, global positioning, and annotation of biological samples in an easy-to-use application available on both traditional workstations and mobile platforms. Because it runs on mobile devices, Fulcrum is critical for tracking samples at the point of generation. (2) Beyond data and sample management, DMAC will perform both standard and custom computational analyses of the data.
This will include DMAC-led investigations into "feature signatures", which address the predictability of data across UA SRP projects; for example, can the gene expression changes associated with a particular arsenic treatment predict metagenomic changes in a similarly treated sample? In conjunction with UA SRP investigators, DMAC will apply traditional algorithms, or develop novel algorithms as needed, to identify signatures for the different data types collected. (3) The storage and analytical capabilities of DMAC will be integrated into a user-friendly web application that allows individual investigators to retrieve, manipulate, and visualize UA SRP data. The web application will be implemented on a server maintained in-house, in conjunction with the R statistical environment. DMAC is thus an integral component of the UA SRP proposal that utilizes state-of-the-art technologies to enable the discovery of novel insights into arsenic exposure and its role in health and disease.
(Data Management and Analysis Core: Aikseng Ooi and Nirav Merchant) Understanding the roles of arsenic in complex diseases like diabetes requires an integrated approach that combines vast quantities of information from multiple fields, including biomedical science, geology, and environmental science. The Data Management and Analysis Core serves as the primary storage and analytical service for the data generated from various experiments investigating arsenic toxicity. DMAC builds upon the CyVerse infrastructure and utilizes state-of-the-art hardware, software, and analytics to merge the findings of various scientific disciplines into new multifaceted insights.