Visual Analysis of Genomic and Clinical Data from Large Patient Cohorts

Park, Peter

Abstract

Comprehensive large cohort studies that collect a wide variety of genomic, epigenomic and clinical data are increasingly commonplace in the life sciences. While large sample sizes are still limited to well-funded consortia, the continuing decrease in the cost of data acquisition will allow individual labs to create larger datasets with fewer resources and will make genomic data analysis for the diagnosis of patients feasible. While this opens unprecedented possibilities for understanding the molecular processes underlying many diseases, it also poses challenges, especially with respect to data analysis and data management. There is a high demand for better analysis and visualization methods to keep pace with the increasing amount of data. At the same time, these data acquisition methods will also revolutionize the discovery and diagnosis of rare diseases. The integration of genomics data with extensive patient records and large patient cohorts promises diagnosis and potentially treatment to those with rare or undiagnosed diseases. In this project we will create novel methods and provide unique software tools that will meet this significant demand. Our methods are a departure from existing visualization approaches, which typically focus on visualizing particular molecular and clinical data types while neglecting the context of a patient cohort. Our proposed approach is distinguished from previous work by taking into account the complex relationships between patients in a cohort. In addition, our approach is the first to integrate genomic data at all scales while supporting the interactive analysis, creation and refinement of patient subsets.
We will address this challenge in three ways. (1) We will develop visualization techniques, deeply integrated with algorithmic support, to identify and characterize disease subtypes. Specifically, we will develop methods that will allow clinical and experimental investigators to go beyond analyzing simple relationships, creating the potential to reveal the less obvious and indirect molecular causes of many diseases. (2) We will create novel visualizations that employ algorithms to select and display important genomic characteristics and the patient's clinical history to study and diagnose rare diseases. (3) We will create a framework to support the development of web-based visual exploration tools, which we will use to create the visualizations for subtype and rare disease analysis. We will also make this framework available for the community to use for other tools. This will allow future projects to produce visual analysis methods that scale to the challenges of big data with less engineering overhead. This project will be a close collaboration between a team of computational (epi)genomics and cancer researchers in the laboratory of the Principal Investigator Peter Park at the Harvard Medical School and data visualization experts in the laboratory of the Co-Investigator Hanspeter Pfister at the Harvard School of Engineering and Applied Sciences. This team possesses the unique combination of expertise that is required to successfully address the challenges that motivate this application.

Public Health Relevance

The ability of scientists and medical doctors to generate large amounts of genome-wide molecular measurements for patient samples has surpassed their ability to efficiently and comprehensively interpret these measurements with existing analysis tools. To address this challenge, we will develop new approaches for visualization and analysis, which will enable clinical and computational experts alike to jointly analyze multiple genomic and clinical data types both for individual patients and for cohorts of patients. These methods will support the discovery and characterization of new subtypes in diseases, as well as the diagnosis of patients who are suffering from rare or previously undescribed diseases, ultimately contributing to better therapies and prognoses.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA198935-01
Application #: 8875824
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Miller, David J

Project Start: 2015-06-01
Project End: 2018-05-31
Budget Start: 2015-06-01
Budget End: 2016-05-31
Support Year: 1
Fiscal Year: 2015

Institution

Name: Harvard Medical School
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 047006379

City: Boston
State: MA
Country: United States

Related projects


NIH 2017 U01 CA	Visual Analysis of Genomic and Clinical Data from Large Patient Cohorts Park, Peter J. / Harvard Medical School	$489,729
NIH 2016 U01 CA	Visual Analysis of Genomic and Clinical Data from Large Patient Cohorts Park, Peter J. / Harvard Medical School
NIH 2015 U01 CA	Visual Analysis of Genomic and Clinical Data from Large Patient Cohorts Park, Peter J. / Harvard Medical School

Publications

Nobre, Carolina; Gehlenborg, Nils; Coon, Hilary et al. (2018) Lineage: Visualizing Multivariate Clinical Data in Genealogy Graphs. IEEE Trans Vis Comput Graph :

Conway, Jake R; Lex, Alexander; Gehlenborg, Nils (2017) UpSetR: an R package for the visualization of intersecting sets and their properties. Bioinformatics 33:2938-2940

Kerzner, E; Lex, A; Sigulinsky, C L et al. (2017) Graffinity: Visualizing Connectivity in Large Graphs. Comput Graph Forum 36:251-260

Kern, Michael; Lex, Alexander; Gehlenborg, Nils et al. (2017) Interactive visual exploration and refinement of cluster assignments. BMC Bioinformatics 18:406

Partl, C; Gratzl, S; Streit, M et al. (2016) Pathfinder: Visual Analysis of Paths in Graphs. Comput Graph Forum 35:71-80

Gratzl, S; Lex, A; Gehlenborg, N et al. (2016) From Visual Exploration to Storytelling and Back Again. Comput Graph Forum 35:491-500

Strobelt, Hendrik; Alsallakh, Bilal; Botros, Joseph et al. (2016) Vials: Visualizing Alternative Splicing of Genes. IEEE Trans Vis Comput Graph 22:399-408
