Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants

Gehlenborg, Nils

Abstract

This proposal combines an extensive mentored training program for the PI with a research project that aims to develop novel approaches for visualization and exploration that will accelerate the identification and validation of disease-associated variants in large and complex genomics and epigenomics data sets. An increasing number of such variants are discovered in studies that generate and analyze a wide range of molecular data types for thousands of patients or samples. This progress is enabled by the availability of computational analysis pipelines that employ sophisticated statistical methods for next-generation sequencing (NGS) data. Interpretation of analysis results by biological and clinical domain experts, however, is emerging as a major bottleneck due to the amount and complexity of the pipeline outputs. To address this, we propose to develop interactive visualization methods and a web-based infrastructure that will enable domain experts to identify disease-associated variants in large (epi)genomic data sets through visual exploration of computational predictions and the underlying data. This will have a significant impact on the rate at which predictions can be verified, interpreted and translated into clinically actionable findings. Our first priority is the design of methods and tools to visualize (epi)genomic data in a range of different contexts, for instance by grouping and representing features based on their function, chromatin state, transcriptional activity or genomic coordinates. We will also develop new non-linear genome representations to compare structural variants across genomes, complementing the functionality of the highly successful genome browsers. We will then investigate how information external to the primary data - for instance from other studies, drug target or biomarker databases - can be applied to guide investigators through the data set.
Finally, we will implement a web-based exploration system for biological and clinical domain experts that combines our interactive visualizations with large-scale public (epi)genomic data sets. The methods and tools developed under this proposal will be generally applicable, and driving biological examples are chosen from The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE and modENCODE).

Public Health Relevance

The visualization methods and tools developed under this proposal will accelerate the identification and verification of disease-associated variants in large genomic and epigenomic data sets, thereby reducing the effort required to translate findings into clinically actionable results. Furthermore, under this proposal, the PI will acquire the skills required to be a productive independent investigator in the biomedical field through further mentored training in genomics and epigenomics as well as in research management.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Transition Award (R00)
Project #: 5R00HG007583-04
Application #: 9128459
Study Section: Special Emphasis Panel (NSS)
Program Officer: Gilchrist, Daniel A

Project Start: 2015-08-18
Project End: 2018-07-31
Budget Start: 2016-08-01
Budget End: 2017-07-31
Support Year: 4
Fiscal Year: 2016

Institution

Name: Harvard Medical School
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 047006379

City: Boston
State: MA
Country: United States

Related projects


NIH 2017 R00 HG	Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants Gehlenborg, Nils / Harvard Medical School
NIH 2016 R00 HG	Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants Gehlenborg, Nils / Harvard Medical School
NIH 2015 R00 HG	Visualization of (Epi)Genomic Data for Discovery of Disease-Associated Variants Gehlenborg, Nils / Harvard Medical School	$248,995

Publications

Lekschas, Fritz; Gehlenborg, Nils (2018) SATORI: a system for ontology-guided visual exploration of biomedical data repositories. Bioinformatics 34:1200-1207

Nobre, Carolina; Gehlenborg, Nils; Coon, Hilary et al. (2018) Lineage: Visualizing Multivariate Clinical Data in Genealogy Graphs. IEEE Trans Vis Comput Graph :

Lekschas, Fritz; Bach, Benjamin; Kerpedjiev, Peter et al. (2018) HiPiler: Visual Exploration of Large Genome Interaction Matrices with Interactive Small Multiples. IEEE Trans Vis Comput Graph 24:522-531

Kerpedjiev, Peter; Abdennur, Nezar; Lekschas, Fritz et al. (2018) HiGlass: web-based visual exploration and analysis of genome interaction maps. Genome Biol 19:125

Conway, Jake R; Lex, Alexander; Gehlenborg, Nils (2017) UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33:2938-2940

Kern, Michael; Lex, Alexander; Gehlenborg, Nils et al. (2017) Interactive visual exploration and refinement of cluster assignments. BMC Bioinformatics 18:406

Manrai, Arjun K; Patel, Chirag J; Gehlenborg, Nils et al. (2016) Methods to enhance the reproducibility of precision medicine. Pac Symp Biocomput 21:180-182

Stitz, H; Luger, S; Streit, M et al. (2016) AVOCADO: Visualization of Workflow-Derived Data Provenance for Reproducible Biomedical Research. Comput Graph Forum 35:481-490
