Although genome-wide association studies (GWAS) have been extremely successful in identifying numerous risk loci for complex traits and diseases, the causal mechanism linking genetic variation to disease risk remains unknown at the vast majority of these loci. This hinders the development of novel drug targets, personalized treatments and accurate prediction of high-risk individuals. In the quest to address this gap, post-GWAS studies are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, transcriptome levels, epigenetic modifications, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals. This raises the need for new computational and statistical methods that can integrate various types of data (genetic, epigenetic, transcriptomic) to understand the causal mechanism of disease at GWAS risk loci. Here we propose to develop new methods and techniques and to apply them to gain insights into the genetic basis of common diseases and traits. Importantly, we aim to circumvent genomic privacy issues (which often prohibit access to large-scale GWAS data) by proposing techniques that operate directly at the summary statistic level (e.g. variant effect sizes). We will apply existing and newly developed methods to GWAS summary data sets for over 30 traits and diseases spanning more than 1,000,000 phenotype measurements, together with a catalogue of over 7,000 biochemical and evolutionary genetic metrics of functionality, as well as over 10,000 individuals for whom genetic variation, gene expression and disease status have been measured.
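To illustrate what operating at the summary statistic level can look like, the sketch below combines per-cohort effect sizes and standard errors for a single variant without touching individual-level genotypes. It is a standard fixed-effect, inverse-variance meta-analysis followed by a Wald test, shown only as an example and not as one of the methods proposed here; the function names and the numeric inputs are hypothetical.

```python
import math


def z_and_p(beta, se):
    """Return the Wald z-score and two-sided normal p-value for one variant."""
    z = beta / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis of (beta, se) pairs from separate cohorts."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se


# Hypothetical summary statistics (effect size, standard error) for one
# variant reported by two cohorts; these numbers are made up for illustration.
cohorts = [(0.10, 0.04), (0.08, 0.05)]
beta, se = inverse_variance_meta(cohorts)
z, p = z_and_p(beta, se)
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.3g}")
```

Only the effect sizes and their standard errors are needed as input, which is why analyses of this kind can proceed when privacy constraints prevent sharing of individual-level data.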
Genetic studies of common diseases are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, gene expression levels, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals, raising the need for new computational and statistical methods. In this proposal we will develop new techniques and apply them to large-scale empirical data to gain insights into the genetic basis of common diseases and traits.
Giambartolomei, Claudia; Zhenli Liu, Jimmy; Zhang, Wen et al. (2018) A Bayesian framework for multiple trait colocalization from summary association statistics. Bioinformatics 34:2538-2545
Johnson, Ruth; Shi, Huwenbo; Pasaniuc, Bogdan et al. (2018) A unifying framework for joint trait analysis under a non-infinitesimal model. Bioinformatics 34:i195-i201
Gusev, Alexander; Mancuso, Nicholas; Won, Hyejung et al. (2018) Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nat Genet 50:538-548
Barfield, Richard; Feng, Helian; Gusev, Alexander et al. (2018) Transcriptome-wide association studies accounting for colocalization using Egger regression. Genet Epidemiol 42:418-433
Franceschini, Nora; Giambartolomei, Claudia; de Vries, Paul S et al. (2018) GWAS and colocalization analyses implicate carotid intima-media thickness and carotid plaque loci in cardiovascular outcomes. Nat Commun 9:5141
Roytman, Megan; Kichaev, Gleb; Gusev, Alexander et al. (2018) Methods for fine-mapping with chromatin and expression data. PLoS Genet 14:e1007240
Mancuso, Nicholas; Gayther, Simon; Gusev, Alexander et al. (2018) Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat Commun 9:4079
Fejzo, Marlena S; Sazonova, Olga V; Sathirapongsasuti, J Fah et al. (2018) Placenta and appetite genes GDF15 and IGFBP7 are associated with hyperemesis gravidarum. Nat Commun 9:1178
Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé et al. (2017) Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits. Am J Hum Genet 100:473-487
Brown, Robert; Kichaev, Gleb; Mancuso, Nicholas et al. (2017) Enhanced methods to detect haplotypic effects on gene expression. Bioinformatics 33:2307-2313
Showing the most recent 10 out of 11 publications