Biomedical research and the basic sciences are increasingly dependent on high-throughput technologies that can simultaneously measure thousands of nucleic acid molecules in a sample. In combination with ingenious laboratory protocols, these technologies have enabled unprecedented ways of studying the molecular basis of disease and phenotypic variation. As a result of the increasing adoption of these technologies, more investigations rely on complex datasets and require the development of new statistical techniques to adequately interpret the data. Today, applications of high-throughput technologies go far beyond their original task of studying the DNA sequence itself and include the measurement of quantitative and dynamic outcomes such as gene expression levels and DNA methylation (DNAm) status. These quantitative and dynamic outcomes introduce levels of variability that give rise to further data analytic challenges related to distinguishing unwanted sources of variability from biologically relevant signals. Furthermore, when measuring these quantitative outcomes, data are subject to severe technological and biological biases that can substantially impact downstream analyses. Our group has previously demonstrated that statistical methodology can provide great improvements over the ad hoc algorithms offered as defaults by technology developers. Our highly cited statistical methodology and our widely used software demonstrate the success of our work.

The National Research Council's Frontiers in Massive Data Analysis publication states that "the challenges for massive data go beyond the storage, indexing, and querying that have been the province of classical database systems and instead hinge on the ambitious goal of inference." Inference is particularly relevant in biomedical applications because we often seek to draw conclusions based on observed differences between groups in the presence of within-group variability. Two particularly challenging tasks relate to performing valid inference when 1) we scan large spaces to identify small regions of interest and 2) the data are affected by unexpected systematic biases or batch effects. We will focus on these two general challenges.

Our specific proposal is to work on the most urgent needs of researchers facing new challenges as they increasingly rely on high-throughput techniques. We will leverage the expertise of our collaborators to prioritize projects. We greatly appreciate the flexibility permitted by the R35 mechanism, as it will help us maximize the impact of our work.
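To make the two challenges concrete, the following is a minimal simulation sketch in Python; the sample sizes, effect sizes, and variable names are illustrative assumptions of ours, not values or methods from the proposal. It shows how an unmodeled batch effect can produce a small p-value with no biological signal, and how scanning many features yields apparent discoveries under the null.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Challenge 2: batch effects. Hypothetical setup: one feature, 20 samples per
# group, NO true biological difference, but each group was processed in a
# different batch and batch 2 carries a systematic technical shift.
n, batch_shift = 20, 1.0
control = rng.normal(0.0, 1.0, n)                # processed in batch 1
treated = rng.normal(0.0, 1.0, n) + batch_shift  # processed in batch 2
t, p = stats.ttest_ind(treated, control)
print(f"confounded comparison: t = {t:.2f}, p = {p:.2g}")
# The small p-value reflects the batch, not biology.

# Challenge 1: scans over large spaces. Testing 1,000 pure-noise features
# still flags roughly 5% of them at p < 0.05, which is why genome-wide scans
# require explicit control of error rates.
null_p = np.array([
    stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
    for _ in range(1000)
])
print(f"null features with p < 0.05: {(null_p < 0.05).mean():.1%}")
```

In practice the batch structure is often unknown or only partially recorded, which is what makes separating technical from biological variability statistically challenging.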
High-throughput technologies are poised to become instrumental in the era of precision medicine. As a result of the increasing adoption of these technologies, more investigations rely on complex datasets and require the development of new techniques to adequately interpret data. We will develop the necessary statistical methods to help make these technologies primary tools for translational research and clinical applications.