Genetics plays a role in many human diseases, whether the disease itself is inherited or it is associated with a substantial change in the activity of genes. A great opportunity exists to better understand and diagnose human disease by utilizing recently developed technologies that allow one to carry out biological studies at the genome-wide level. There is a substantial need to develop new quantitative tools specifically designed to analyze the enormous amounts of data generated by these studies. The overall goal of the proposed research is to develop statistical methods and software useful in understanding genomic data. The particular focus is on functional genomics, where data from gene expression arrays and large-scale genotyping can be used to study how large numbers of genes work to accomplish various functional roles. Statistical inference techniques for DNA microarray experiments will be developed, specifically for identifying genes that are differentially expressed across two or more biological conditions. These techniques will be applicable to both static experiments and time course experiments. Statistical methods for the genetic dissection of transcriptional regulation will also be developed. These include methods to estimate the genetic control of gene expression at both genome-wide and gene-specific levels, and methods to map loci showing linkage to gene expression. In particular, multiple-locus linkage analysis based on a model selection approach will be investigated, where new methods will be developed for computationally efficient model generation, selection, and significance analysis. All of these methods will be implemented in user-friendly software that will be freely distributed to the academic community. The methods will also be tested on publicly available data in collaboration with experimentalists, in an effort to verify that the methods provide biologically meaningful results.
Overall, this work is aimed at contributing to the understanding of the molecular biology and genetic basis of human disease by providing rigorous analytical tools for genomic studies.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Brooks, Lisa
University of Washington
Schools of Arts and Sciences
United States
Hackett, Sean R; Zanotelli, Vito R T; Xu, Wenxin et al. (2016) Systems-level analysis of mechanisms regulating yeast metabolic flux. Science 354:
Ochoa, Alejandro; Storey, John D; Llinás, Manuel et al. (2015) Beyond the E-Value: Stratified Statistics for Protein Domain Prediction. PLoS Comput Biol 11:e1004509
Chung, Neo Christopher; Storey, John D (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics 31:545-54
Robinson, David G; Wang, Jean Y; Storey, John D (2015) A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays. Nucleic Acids Res 43:e131
Marstrand, Troels T; Storey, John D (2014) Identifying and mapping cell-type-specific chromatin programming of gene expression. Proc Natl Acad Sci U S A 111:E645-54
Robinson, David G; Storey, John D (2014) subSeq: determining appropriate sequencing depth through efficient read subsampling. Bioinformatics 30:3424-6
Robinson, David G; Chen, Wei; Storey, John D et al. (2014) Design and analysis of Bar-seq experiments. G3 (Bethesda) 4:11-8
Kim, Jinhee; Ghasemzadeh, Nima; Eapen, Danny J et al. (2014) Gene expression profiles associated with acute myocardial infarction and risk of cardiovascular death. Genome Med 6:40
Jaffe, Andrew E; Storey, John D; Ji, Hongkai et al. (2013) Gene set bagging for estimating the probability a statistically significant result will replicate. BMC Bioinformatics 14:360
Leek, Jeffrey T; Johnson, W Evan; Parker, Hilary S et al. (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28:882-3

Showing the most recent 10 out of 25 publications