Rare variants may be responsible for a significant amount of the uncharacterized genetic risk underlying many diseases. Large cohorts are necessary to have sufficient power to test such variants. However, assessing rare variants with next-generation sequencing is still too costly and time-consuming to be used on a very large scale. We propose an innovative project to impute rare variants, using the existing and ever-growing body of whole-genome and whole-exome sequence data, into an extremely large cohort of 100,000 individuals who have been genotyped at over 650,000 single nucleotide polymorphisms (SNPs). It is well known that the ability to impute a variant depends on the number of individuals carrying that variant in the reference panel, but it is still not clear how well imputation can work for very rare variants. By combining all the available public reference panels, we aim to increase the number of reference subjects 10-fold beyond the 1,092 individuals typically used from the 1000 Genomes Project. We will test the validity of our approach by applying it to telomere length, which has been measured in the same 100,000 individuals who were genotyped. Telomere length is an important characteristic reflecting cellular aging. It is known to decline with age and has demonstrated associations with cardiovascular disease and its risk factors, cancer, diabetes, and mortality, but the heritability of telomere length has not been fully explained. Understanding the genetic factors underlying telomere length will lead to a better understanding of telomere biology, with obvious health implications.
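To illustrate why reference panel size matters for rare variants, the following is a minimal sketch (not part of the proposed analysis) of how many copies of a minor allele one would expect to find in a panel of a given size. It assumes diploid individuals whose chromosomes are sampled independently; the function names, the allele frequencies, and the exact 10-fold panel size of 10,920 are illustrative assumptions, not figures from the project.

```python
# Illustrative sketch: expected minor-allele copies in a reference panel.
# Assumptions (not from the project): diploid individuals, independently
# sampled chromosomes, and hypothetical minor allele frequencies (MAFs).


def expected_minor_allele_copies(n_individuals: int, maf: float) -> float:
    """Expected number of minor-allele copies among 2N chromosomes."""
    return 2 * n_individuals * maf


def prob_at_least_one_copy(n_individuals: int, maf: float) -> float:
    """Probability that the panel contains at least one minor-allele copy."""
    return 1.0 - (1.0 - maf) ** (2 * n_individuals)


# 1,092 is the 1000 Genomes panel size quoted in the abstract; the 10-fold
# figure is a hypothetical combined panel used only for comparison.
panels = {"1000 Genomes": 1092, "combined (hypothetical 10x)": 10920}

for maf in (0.01, 0.001, 0.0001):
    for name, n in panels.items():
        copies = expected_minor_allele_copies(n, maf)
        p_seen = prob_at_least_one_copy(n, maf)
        print(f"MAF={maf:<7} {name:<28} expected copies={copies:8.2f} "
              f"P(at least one copy)={p_seen:.3f}")
```

Under these assumptions, a 10-fold larger panel carries 10 times as many expected copies of any given rare allele, which is the motivation for pooling public reference panels.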

Public Health Relevance

Rare variants may explain the missing heritability (genetic risk) of many diseases. We aim to use existing public reference genome sequence data to statistically impute rare variants into an existing cohort of 100,000 well-phenotyped individuals. We will test the imputed SNPs for association with telomere length, and other researchers may then examine them for association with many other phenotypes in the cohort.

Agency: National Institutes of Health (NIH)
Institute: National Institute on Aging (NIA)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21AG046616-02
Application #: 8741922
Study Section: Genetics of Health and Disease Study Section (GHD)
Program Officer: Guo, Max
Project Start: 2013-09-30
Project End: 2015-06-30
Budget Start: 2014-07-01
Budget End: 2015-06-30
Support Year: 2
Fiscal Year: 2014

Name: University of California San Francisco
Department: Internal Medicine/Medicine
Type: Schools of Medicine
City: San Francisco
State: CA
Country: United States
Zip Code: 94143

Hoffmann, Thomas J; Theusch, Elizabeth; Haldar, Tanushree et al. (2018) A large electronic-health-record-based genome-wide study of serum lipids. Nat Genet 50:401-413
Hoffmann, Thomas J; Choquet, Hélène; Yin, Jie et al. (2018) A Large Multiethnic Genome-Wide Association Study of Adult Body Mass Index Identifies Novel Loci. Genetics 210:499-515
Jorgenson, Eric; Melles, Ronald B; Hoffmann, Thomas J et al. (2016) Common coding variants in the HLA-DQB1 region confer susceptibility to age-related macular degeneration. Eur J Hum Genet 24:1049-55
Hoffmann, Thomas J; Van Den Eeden, Stephen K; Sakoda, Lori C et al. (2015) A large multiethnic genome-wide association study of prostate cancer identifies novel risk variants and substantial ethnic differences. Cancer Discov 5:878-91
Hoffmann, Thomas J; Witte, John S (2015) Strategies for Imputing and Analyzing Rare Variants in Association Studies. Trends Genet 31:556-563
Shen, Ling; Hoffmann, Thomas J; Melles, Ronald B et al. (2015) Differences in the Genetic Susceptibility to Age-Related Macular Degeneration Clinical Subtypes. Invest Ophthalmol Vis Sci 56:4290-9
Hoffmann, Thomas J; Sakoda, Lori C; Shen, Ling et al. (2015) Imputation of the rare HOXB13 G84E mutation and cancer risk in a large population-based cohort. PLoS Genet 11:e1004930