Statistical methods for genetic data

Storey, John

Abstract

Genetics plays a role in many human diseases, whether the disease itself is inherited or is associated with a substantial change in the activity of genes. A great opportunity exists to better understand and diagnose human disease by utilizing recently developed technologies that allow one to carry out biological studies at the genome-wide level. There is a substantial need to develop new quantitative tools specifically designed to analyze the enormous amounts of data generated by these studies. The overall goal of the proposed research is to develop statistical methods and software useful in understanding genomic data. The particular focus is on functional genomics, where data from gene expression arrays and large-scale genotyping can be used to study how large numbers of genes work to accomplish various functional roles. Statistical inference techniques for DNA microarray experiments will be developed, specifically for identifying genes that are differentially expressed among two or more biological conditions. These techniques will be applicable to both static experiments and time course experiments. Statistical methods for the genetic dissection of transcriptional regulation will also be developed. These include methods to estimate the genetic control of gene expression at both genome-wide and gene-specific levels, and methods to map loci showing linkage to gene expression. In particular, multiple locus linkage analysis from a model selection approach will be investigated, where new methods will be developed for computationally efficient model generation, selection, and significance analysis. All of these methods will be implemented in user-friendly software that will be freely distributed to the academic community. The methods will also be tested on publicly available data in collaboration with experimentalists, in an effort to verify that the methods provide biologically meaningful results.
Overall, this work is aimed at contributing to the understanding of the molecular biology and genetic basis of human disease by providing rigorous analytical tools for genomic studies.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 7R01HG002913-05
Application #: 7600686
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Brooks, Lisa

Project Start: 2004-07-19
Project End: 2008-06-30
Budget Start: 2008-02-01
Budget End: 2008-06-30
Support Year: 5
Fiscal Year: 2007
Total Cost: $187,266

Institution

Name: Princeton University
DUNS #: 002484665

City: Princeton
State: NJ
Country: United States
Zip Code: 08544

Publications

Hackett, Sean R; Zanotelli, Vito R T; Xu, Wenxin et al. (2016) Systems-level analysis of mechanisms regulating yeast metabolic flux. Science 354:

Ochoa, Alejandro; Storey, John D; Llinás, Manuel et al. (2015) Beyond the E-Value: Stratified Statistics for Protein Domain Prediction. PLoS Comput Biol 11:e1004509

Chung, Neo Christopher; Storey, John D (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics 31:545-54

Robinson, David G; Wang, Jean Y; Storey, John D (2015) A nested parallel experiment demonstrates differences in intensity-dependence between RNA-seq and microarrays. Nucleic Acids Res 43:e131

Marstrand, Troels T; Storey, John D (2014) Identifying and mapping cell-type-specific chromatin programming of gene expression. Proc Natl Acad Sci U S A 111:E645-54

Robinson, David G; Storey, John D (2014) subSeq: determining appropriate sequencing depth through efficient read subsampling. Bioinformatics 30:3424-6

Robinson, David G; Chen, Wei; Storey, John D et al. (2014) Design and analysis of Bar-seq experiments. G3 (Bethesda) 4:11-8

Kim, Jinhee; Ghasemzadeh, Nima; Eapen, Danny J et al. (2014) Gene expression profiles associated with acute myocardial infarction and risk of cardiovascular death. Genome Med 6:40

Jaffe, Andrew E; Storey, John D; Ji, Hongkai et al. (2013) Gene set bagging for estimating the probability a statistically significant result will replicate. BMC Bioinformatics 14:360

Leek, Jeffrey T; Johnson, W Evan; Parker, Hilary S et al. (2012) The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28:882-3

Showing the most recent 10 out of 25 publications
