It is becoming feasible to generate massive quantities of DNA sequence data for disease association studies. This presents both challenges and opportunities for human genetics. Perhaps most importantly, large-scale resequencing data will make it possible to start identifying genes at which rare variants contribute to disease susceptibility. Here we propose to develop several of the analytical tools that will be needed to analyze and interpret the forthcoming data. Our first two aims use some of the first genome-wide resequencing data to better annotate noncoding sites that are likely to be functional.
Our third aim develops statistical methods for analyzing data from disease association studies to identify genes with rare variants that contribute to disease. These methods will use the annotation approaches developed in the first two aims to prioritize variants according to the likelihood that they have biological function.

BRIEF NARRATIVE SUMMARY

The purpose of this project is to develop new tools for analyzing and interpreting data from DNA resequencing studies. We will develop methods that use resequencing data to identify sites that are likely to be functional, especially in noncoding regions. Using, in part, our improved annotation of potentially functional sites, we will also develop new statistical methods to identify genes that contribute to disease phenotypes through the action of many rare variants.
|Blake, Lauren E; Thomas, Samantha M; Blischak, John D et al. (2018) A comparative study of endoderm differentiation in humans and chimpanzees. Genome Biol 19:162|
|Banovich, Nicholas E; Li, Yang I; Raj, Anil et al. (2018) Impact of regulatory variation across human iPSCs and differentiated cells. Genome Res 28:122-131|
|Gymrek, Melissa; Willems, Thomas; Guilmatre, Audrey et al. (2016) Abundant contribution of short tandem repeats to gene expression variation in humans. Nat Genet 48:22-9|
|Raj, Anil; Wang, Sidney H; Shim, Heejung et al. (2016) Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling. Elife 5:|
|Li, Yang I; van de Geijn, Bryce; Raj, Anil et al. (2016) RNA splicing is a primary link between genetic variation and disease. Science 352:600-4|
|Field, Yair; Boyle, Evan A; Telis, Natalie et al. (2016) Detection of human adaptation during the past 2000 years. Science 354:760-764|
|Burrows, Courtney K; Banovich, Nicholas E; Pavlovic, Bryan J et al. (2016) Genetic Variation, Not Cell Type of Origin, Underlies the Majority of Identifiable Regulatory Differences in iPSCs. PLoS Genet 12:e1005793|
|Çalışkan, Minal; Baker, Samuel W; Gilad, Yoav et al. (2015) Host genetic variation influences gene expression response to rhinovirus infection. PLoS Genet 11:e1005111|
|Raj, Anil; Shim, Heejung; Gilad, Yoav et al. (2015) msCentipede: Modeling Heterogeneity across Genomic Sites and Replicates Improves Accuracy in the Inference of Transcription Factor Binding. PLoS One 10:e0138030|
|Battle, Alexis; Khan, Zia; Wang, Sidney H et al. (2015) Genomic variation. Impact of regulatory variation from RNA to protein. Science 347:664-7|
Showing the most recent 10 out of 32 publications