It is becoming feasible to generate massive quantities of DNA sequence data for disease association studies. This presents both challenges and opportunities for human genetics. Perhaps most importantly, large-scale resequencing data will make it possible to start identifying genes at which rare variants contribute to disease susceptibility. Here we propose to create a number of the analytical tools that will be needed to analyze and interpret the forthcoming data. Our first two Aims focus on using some of the first genome-wide resequencing data to better annotate noncoding sites that are likely to be functional.
Our third Aim develops statistical methods for analyzing data from disease association studies to identify genes in which rare variants contribute to disease. These methods will use the annotation approaches developed in the first two Aims to prioritize variants according to the likelihood that they have biological function. Drawing in part on this improved annotation of potentially functional sites, we will also develop new statistical methods to identify genes that contribute to disease phenotypes through the action of many rare variants.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section
Special Emphasis Panel (ZMH1-ERB-C (06))
Program Officer
Yao, Yin Y
University of Chicago
Schools of Medicine
United States
Blake, Lauren E; Thomas, Samantha M; Blischak, John D et al. (2018) A comparative study of endoderm differentiation in humans and chimpanzees. Genome Biol 19:162
Banovich, Nicholas E; Li, Yang I; Raj, Anil et al. (2018) Impact of regulatory variation across human iPSCs and differentiated cells. Genome Res 28:122-131
Gymrek, Melissa; Willems, Thomas; Guilmatre, Audrey et al. (2016) Abundant contribution of short tandem repeats to gene expression variation in humans. Nat Genet 48:22-9
Raj, Anil; Wang, Sidney H; Shim, Heejung et al. (2016) Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling. Elife 5:
Li, Yang I; van de Geijn, Bryce; Raj, Anil et al. (2016) RNA splicing is a primary link between genetic variation and disease. Science 352:600-4
Field, Yair; Boyle, Evan A; Telis, Natalie et al. (2016) Detection of human adaptation during the past 2000 years. Science 354:760-764
Burrows, Courtney K; Banovich, Nicholas E; Pavlovic, Bryan J et al. (2016) Genetic Variation, Not Cell Type of Origin, Underlies the Majority of Identifiable Regulatory Differences in iPSCs. PLoS Genet 12:e1005793
Çalışkan, Minal; Baker, Samuel W; Gilad, Yoav et al. (2015) Host genetic variation influences gene expression response to rhinovirus infection. PLoS Genet 11:e1005111
Raj, Anil; Shim, Heejung; Gilad, Yoav et al. (2015) msCentipede: Modeling Heterogeneity across Genomic Sites and Replicates Improves Accuracy in the Inference of Transcription Factor Binding. PLoS One 10:e0138030
Battle, Alexis; Khan, Zia; Wang, Sidney H et al. (2015) Genomic variation. Impact of regulatory variation from RNA to protein. Science 347:664-7
