Genetic epidemiology has entered the big data era, with many cohort studies having access not only to genome-wide genotyping data but also to a large number of disease-related traits and a variety of biomarkers. These extensive datasets hold great promise for increasing our understanding of human diseases and improving public health. However, statistical tools to leverage these data are severely lacking, and the development of innovative methodological approaches remains a key component of future successes. Indeed, most genetic association studies still use a standard univariate approach, testing each measured phenotype independently for association with each single genetic variant. Our recent work has shown that phenotypes sharing genetic and environmental underpinnings can be leveraged in multi-phenotype analyses to increase statistical power to detect associated genetic loci. Diseases showing heterogeneity and/or evidence of subtypes that can be partially characterized by endophenotypes and biomarkers, including several autoimmune diseases such as Sjögren's syndrome (SS), are particularly good candidates for multi-phenotype methods. In this proposal we aim to apply two new multi-phenotype methods to the analysis of over 50 SS-related phenotypes from the Sjögren's International Collaborative Clinical Alliance (SICCA) study. SICCA has generated a unique collection of SS case/control and SS-related phenotypes, along with genome-wide genotype data, for more than 3,500 individuals. The first proposed approach is an extension of a multivariate method based on a principal component analysis framework that we recently developed. Unlike standard univariate approaches, our method is capable of detecting associations even when there are multiple genetically heterogeneous subphenotypes of the disease. It is based on a composite null hypothesis (all phenotypes are tested jointly), so an association between a single phenotype and a genotype cannot be established. 
This limitation is the cost of a dramatic increase in statistical power to identify genetic variants with positive and negative pleiotropic effects (concordant and discordant genetic effects, respectively). The second approach relies on a new and innovative strategy that will be developed as part of this proposal. As opposed to multivariate methods, this approach keeps the univariate property of testing association between a single outcome and a single genetic variant but, as in multivariate approaches, it leverages correlation with other available phenotypes. Using the proposed approaches, we expect to dramatically increase our ability to identify genetic variants associated with SS phenotypes. We fully expect that these two methods will reveal important insights into the genetic basis of SS and will go on to serve the broader genetics community.
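
The intuition behind the second approach, a univariate test that borrows information from correlated phenotypes, can be illustrated with a small simulation. This is only a sketch under stated assumptions and not the method developed in the proposal: the data are simulated, the effect size, allele frequency, and variable names (`g`, `y`, `c`, `ols_first_coef`) are hypothetical, and the correlated phenotype `c` is assumed to have no genetic effect of its own. If `c` were itself influenced by the variant, the adjusted estimate could be biased.

```python
import numpy as np

def ols_first_coef(y, X):
    """Fit y ~ X + intercept by OLS; return (beta, se, t) for the first column of X."""
    n = len(y)
    design = np.column_stack([X, np.ones(n)])
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = n - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # coefficient covariance
    se = np.sqrt(cov[0, 0])
    return beta[0], se, beta[0] / se

rng = np.random.default_rng(0)
n = 3500                                               # sample size of the same order as SICCA
g = rng.binomial(2, 0.3, size=n).astype(float)         # simulated genotype (0/1/2 allele counts)
shared = rng.normal(size=n)                            # shared non-genetic factor (assumption)
y = 0.2 * g + shared + 0.5 * rng.normal(size=n)        # outcome with a hypothetical genetic effect
c = shared + 0.5 * rng.normal(size=n)                  # correlated phenotype, no genetic effect

# Standard univariate test: outcome on genotype only.
beta_uni, se_uni, t_uni = ols_first_coef(y, g[:, None])

# Same single-outcome, single-variant test, adjusted for the correlated phenotype.
beta_adj, se_adj, t_adj = ols_first_coef(y, np.column_stack([g, c]))

print(f"unadjusted: beta={beta_uni:.3f} se={se_uni:.4f} t={t_uni:.2f}")
print(f"adjusted:   beta={beta_adj:.3f} se={se_adj:.4f} t={t_adj:.2f}")
```

In this setup the adjusted model explains part of the outcome's non-genetic variance, so the standard error of the genotype coefficient shrinks while the test still concerns one outcome and one variant.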

Public Health Relevance

Researchers now commonly have access to both genome-wide genotyping data and large numbers of disease-related traits and biomarkers. While these extensive datasets might hold keys to understanding the genetic architecture of human traits and diseases, the statistical tools to analyze them are severely lacking. In this research proposal we aim to develop innovative analytical strategies for genome-wide association studies of a large collection of Sjögren's syndrome subphenotypes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Dental & Craniofacial Research (NIDCR)
Type
Small Research Grants (R03)
Project #
5R03DE025665-02
Application #
9344575
Study Section
Special Emphasis Panel (ZDE1)
Program Officer
Wang, Lu
Project Start
2016-09-02
Project End
2018-08-31
Budget Start
2017-09-01
Budget End
2018-08-31
Support Year
2
Fiscal Year
2017
Name
Harvard University
Department
Public Health & Prev Medicine
Type
Schools of Public Health
DUNS #
149617367
City
Boston
State
MA
Country
United States
Zip Code
02115
Kang, Hyun Min; Subramaniam, Meena; Targ, Sasha et al. (2018) Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat Biotechnol 36:89-94
Mangul, Serghei; Yang, Harry Taegyun; Strauli, Nicolas et al. (2018) ROP: dumpster diving in RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues. Genome Biol 19:36
Aschard, Hugues; Guillemot, Vincent; Vilhjalmsson, Bjarni et al. (2017) Covariate selection for association screening in multiphenotype genetic studies. Nat Genet 49:1789-1795
Rahmani, Elior; Zaitlen, Noah; Baran, Yael et al. (2017) Correcting for cell-type heterogeneity in DNA methylation: a comprehensive evaluation. Nat Methods 14:218-219