Although genome-wide association studies (GWAS) have been extremely successful in identifying numerous risk loci for complex traits and diseases, at the vast majority of these loci the causal mechanism linking genetic variation to disease risk remains largely unknown. This hinders the development of novel drug targets and personalized treatments, as well as the accurate prediction of high-risk individuals. In the quest to address this gap, post-GWAS studies are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, transcriptome levels, epigenetic modifications, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals. This raises the need for new computational and statistical methods that can integrate various types of data (genetic, epigenetic, transcriptomic) to understand the causal mechanism of disease at GWAS risk loci. Here we propose to develop new methods and techniques and to apply them to gain insights into the genetic basis of common diseases and traits. Importantly, we aim to circumvent genomic privacy issues (which often prohibit access to large-scale GWAS data) by proposing techniques that operate directly at the summary statistic level (e.g. variant effect sizes). We will apply existing and newly developed methods to GWAS summary data sets for over 30 traits and diseases spanning more than 1,000,000 phenotype measurements, jointly with a catalogue of over 7,000 biochemical and evolutionary genetic metrics of functionality, as well as data from over 10,000 individuals for whom genetic variation, gene expression and disease status have been measured.

Public Health Relevance

Genetic studies of common diseases are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, gene expression levels, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals, raising the need for new computational and statistical methods. In this proposal we will develop new techniques and apply them to large-scale empirical data to gain insights into the genetic basis of common diseases and traits.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG009120-04
Application #
9881332
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Li, Rongling
Project Start
2017-03-01
Project End
2022-02-28
Budget Start
2020-03-01
Budget End
2021-02-28
Support Year
4
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of California Los Angeles
Department
Pathology
Type
Schools of Medicine
DUNS #
092530369
City
Los Angeles
State
CA
Country
United States
Zip Code
90095
Giambartolomei, Claudia; Zhenli Liu, Jimmy; Zhang, Wen et al. (2018) A Bayesian framework for multiple trait colocalization from summary association statistics. Bioinformatics 34:2538-2545
Johnson, Ruth; Shi, Huwenbo; Pasaniuc, Bogdan et al. (2018) A unifying framework for joint trait analysis under a non-infinitesimal model. Bioinformatics 34:i195-i201
Gusev, Alexander; Mancuso, Nicholas; Won, Hyejung et al. (2018) Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nat Genet 50:538-548
Barfield, Richard; Feng, Helian; Gusev, Alexander et al. (2018) Transcriptome-wide association studies accounting for colocalization using Egger regression. Genet Epidemiol 42:418-433
Franceschini, Nora; Giambartolomei, Claudia; de Vries, Paul S et al. (2018) GWAS and colocalization analyses implicate carotid intima-media thickness and carotid plaque loci in cardiovascular outcomes. Nat Commun 9:5141
Roytman, Megan; Kichaev, Gleb; Gusev, Alexander et al. (2018) Methods for fine-mapping with chromatin and expression data. PLoS Genet 14:e1007240
Mancuso, Nicholas; Gayther, Simon; Gusev, Alexander et al. (2018) Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat Commun 9:4079
Fejzo, Marlena S; Sazonova, Olga V; Sathirapongsasuti, J Fah et al. (2018) Placenta and appetite genes GDF15 and IGFBP7 are associated with hyperemesis gravidarum. Nat Commun 9:1178
Shi, Huwenbo; Mancuso, Nicholas; Spendlove, Sarah et al. (2017) Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits. Am J Hum Genet 101:737-751
Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé et al. (2017) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet 100:473-487
