Software for the statistical analysis of microarray probe level data

Irizarry, Rafael

Abstract

Microarray technology has become a standard tool in medical science and basic biology research. A major achievement of the technology is the successful development of an FDA-approved breast cancer recurrence assay, which makes it possible to identify patients at risk of distant recurrence following surgery. Microarrays have also become the standard tool of genome-wide association studies (GWAS) which, according to Francis Collins, have led to "an astounding number of common DNA variations that play a part in the risk of developing common diseases such as heart disease, diabetes, cancer or autoimmunity". Approximately one half of all PubMed publications citing microarrays were published during the last 2 years (15,275 published during 2009-2010; 15,926 published prior to 2009). We therefore expect that laboratories in academia and industry will continue to rely on these technologies for several years and that manufacturers will continue to develop new products at a rapid pace. With microarray technologies, a number of critical steps are required to convert raw measures into the data relied upon by biologists and clinicians. These data manipulations, referred to as preprocessing, have enormous influence on the quality of the ultimate measurements and on the studies that rely upon them. However, the typical analysis software does not provide access to raw probe-level data. Our group has previously demonstrated that the use of alternative methodology can substantially improve accuracy and precision relative to the ad hoc procedures implemented in the default tools provided by the manufacturers. Through our suite of Bioconductor packages, we offer a flexible environment for statistical computing that continues to be the most widely used tool for the analysis of microarray probe-level data. During the last decade, much of our research has been dedicated to understanding the bias and systematic errors that can arise in high-throughput technologies.
Systematic errors obscure results, thwart discovery, and contribute to findings that are not reproducible. The challenges of removing systematic errors are not isolated to array-based technologies. For example, problems similar to those encountered in microarrays have been reported for raw data from second-generation sequencing. For microarrays, we have amassed a substantial knowledge base and data analysis tools to effectively preprocess raw data, making the technology primed for translational research and clinical applications. Our software tools have partly facilitated this achievement and will play an important role in the promising next period of research driven by microarray technology. We are therefore responding to the request for application (RFA) for the continued development and maintenance of software by proposing to continue to provide our successful and widely used resources.

Public Health Relevance

The research community has amassed substantial knowledge and developed reliable data analysis tools that effectively deal with bias and systematic error in microarray technology. The technology is primed for translational research and clinical applications. Our software tools have partly facilitated this achievement and will play an important role in the promising next period of research driven by microarray technology.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Research Project (R01)
Project #: 2R01RR021967-04
Application #: 8237124
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Sheeley, Douglas

Project Start: 2005-07-01
Project End: 2014-07-31
Budget Start: 2011-09-15
Budget End: 2012-07-31
Support Year: 4
Fiscal Year: 2011
Total Cost: $373,321

Institution

Name: Johns Hopkins University
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 001910777

City: Baltimore
State: MD
Country: United States
Zip Code: 21218

Related projects


NIH 2011 R01 RR	Software for the statistical analysis of microarray probe level data	Irizarry, Rafael Angel / Johns Hopkins University	$373,321
NIH 2009 R01 RR	Software for the Statistical Analysis of Microarray Probe Level Data	Irizarry, Rafael Angel / Johns Hopkins University	$267,792
NIH 2008 R01 RR	Software for the Statistical Analysis of Microarray Probe Level Data	Irizarry, Rafael Angel / Johns Hopkins University	$277,192
NIH 2007 R01 RR	Software for the Statistical Analysis of Microarray Probe Level Data	Irizarry, Rafael Angel / Johns Hopkins University	$303,446

Publications

Kumar, M Senthil; Slud, Eric V; Okrah, Kwame et al. (2018) Analysis and correction of compositional bias in sparse sequencing count data. BMC Genomics 19:799

Hicks, Stephanie C; Irizarry, Rafael A (2015) quantro: a data-driven approach to guide the choice of an appropriate normalization method. Genome Biol 16:117

Timp, Winston; Bravo, Hector Corrada; McDonald, Oliver G et al. (2014) Large hypomethylated blocks as a universal defining epigenetic alteration in human solid tumors. Genome Med 6:61

Parker, Hilary S; Leek, Jeffrey T (2012) The practical effect of batch on genomic prediction. Stat Appl Genet Mol Biol 11:Article 10

McCall, Matthew N; Jaffee, Harris A; Irizarry, Rafael A (2012) fRMA ST: frozen robust multiarray analysis for Affymetrix Exon and Gene ST arrays. Bioinformatics 28:3153-4

Jaffe, Andrew E; Murakami, Peter; Lee, Hwajin et al. (2012) Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. Int J Epidemiol 41:200-9

Jaffe, Andrew E; Feinberg, Andrew P; Irizarry, Rafael A et al. (2012) Significance analysis and statistical dissection of variably methylated regions. Biostatistics 13:166-78

Leek, Jeffrey T; Johnson, W Evan; Parker, Hilary S et al. (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28:882-3

Scharpf, Robert B; Irizarry, Rafael A; Ritchie, Matthew E et al. (2011) Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw 40:1-32

McCall, Matthew N; Murakami, Peter N; Lukk, Margus et al. (2011) Assessing affymetrix GeneChip microarray quality. BMC Bioinformatics 12:137

Showing the most recent 10 out of 31 publications
