Although genome-wide association studies (GWAS) have been extremely successful in identifying numerous risk loci for complex traits and diseases, at the vast majority of these loci the causal mechanism linking genetic variation to disease risk remains unknown. This impedes the development of novel drug targets, personalized treatments and accurate prediction of high-risk individuals. In the quest to address this gap, post-GWAS studies are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, transcriptome levels, epigenetic modifications, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals. This raises the need for new computational and statistical methods that can integrate various types of data (genetic, epigenetic, transcriptomic) to understand the causal mechanism of disease at GWAS risk loci. Here we propose to develop new methods and techniques and to apply them to gain insights into the genetic basis of common diseases and traits. Importantly, we aim to circumvent genomic privacy issues (which often prohibit access to large-scale GWAS data) by proposing techniques that operate directly at the summary statistic level (e.g. variant effect sizes). We will apply existing and newly developed methods to GWAS summary data sets for over 30 traits and diseases spanning more than 1,000,000 phenotype measurements, together with a catalogue of over 7,000 biochemical and evolutionary genetic metrics of functionality, as well as over 10,000 individuals for whom genetic variation, gene expression and disease status have been measured.
Genetic studies of common diseases are experiencing a "big data" revolution driven by the exponentially decreasing costs of high-throughput genomic assays. Multiple layers of data (genetic variation, gene expression levels, localization of tissue-specific regulatory sites, etc.) are routinely collected in increasingly large cohorts of individuals, raising the need for new computational and statistical methods. In this proposal we will develop new techniques and apply them to large-scale empirical data to gain insights into the genetic basis of common diseases and traits.