The onset of most human disease involves numerous molecular-level changes to the complex system of interacting genes and pathways that function differently in specific cell-lineage, pathway, and treatment contexts. This system is probed by thousands of functional genomics and quantitative genetic studies, and integrative analysis of these data can generate testable hypotheses identifying causal genetic variants and linking them to network level changes in cells to disease phenotypes. This can enable deeper molecular-level understanding of pathophysiology, paving the way to genome-based precision medicine. The long term goal of this project is to enable such discoveries through integrative analysis of high- throughput biological data in a disease context. In the previous funding periods, we developed accurate data integration methods, created algorithms for the prediction of disease genes through context-specific and mechanistic network models and analysis of quantitative genetics data, and made novel insights into important biological processes and diseases. We further enabled experimental biological discovery by building public interactive systems capable of real-time user-driven integration that are popular among experimental biologists. We now propose to connect these gene-level functional network approaches with the underlying genomic variation by deciphering how genomic variants lead to specific transcriptional and posttranscriptional effects. We propose to develop ab initio sequence-level models capable of predicting biochemical effects of any genomic variant (including rare or never observed) on chromatin state and RNA regulation, then link these effects with gene-level regulatory consequences (including tissue-specific transcription and RNA splicing), and finally put genomic sequence directly into the network context via a statistical approach for detecting genes and network neighborhoods with a significantly elevated mutational burden in disease. Our key deliverable will be a user- friendly, interactive web-based framework enabling systems-level variant impact analysis in a network context and an open source library for computational scientists. In addition to systematic analysis across contexts and diseases, we will collaborate with experimentalists to apply our methods to Alzheimer?s, autism spectrum disorders, chronic kidney disease, immune diseases, and congenital heart defects as case studies for the iterative improvement of our methods and to directly contribute to better understanding of these diseases.
To pave the way for mechanistic interpretation of disease in the genomic context and eventually, precision medicine, we will develop algorithms for de novo prediction of functional biochemical effects of noncoding variants at the DNA regulation and RNA processing levels and then build frameworks for sequence-based prediction of tissue-specific transcription and post-transcriptional RNA processes (starting with splicing). To facilitate discovery of disease mechanisms, we will develop approaches for analyzing these variant effects in a network context, including those developed in the previous grant period (mechanistic and functional networks) and novel network models that integrate exon usage information or enhancer-gene interactions. In addition to verifying top predictions experimentally in our group or by our collaborators in case study areas of neurodegenerative disease, chronic kidney disease, ASD, and congenital heart disease, we will make our methods available to the broader biomedical community through public, interactive user interfaces and open source libraries.
Showing the most recent 10 out of 67 publications