The goal of the Data Analysis Unit is to develop computational tools, pipelines and web portals that use the spatial and temporal multi-omic molecular characterization of lung premalignant lesions (PML) to build a Lung Pre-Cancer Atlas (Lung PCA). The Lung PCA will serve as a resource to gain insights into: (1) spatial, temporal and molecular interactions between PML cells, the microenvironment, and the immune system; (2) intra-lesional heterogeneity and clonal evolution; and (3) inter-lesional heterogeneity and its effects on spatial and temporal PML molecular and cellular dynamics. The motivation to create the Lung PCA stems from the hypothesis that with successful lung cancer interception, lung cancer will cease to be the most common cause of cancer-associated mortality. The ability to detect and treat incipient lung cancer will be enabled by molecular diagnostics that determine lung cancer risk and by therapeutic strategies that reduce that risk. It is in this context that the Lung PCA will play a foundational role, as it will provide a deeper understanding of the molecular pathways by which lung carcinogenesis either proceeds or fails. The Lung PCA will incorporate clinical data, pathologic data, genomic data, transcriptomic data, and spatially resolved data about gene and protein expression. Applying tools to integrate and interrogate these data, both cross-sectionally and over space and time, will allow the Lung PCA to be used as a research environment to answer clinically relevant questions about lung PML development. To demonstrate the utility of the Lung PCA, we will use the Atlas to discover biomarkers of PML malignant progression. The Data Analysis Unit will develop the Lung PCA as a web-based resource by extending the current functionality of the cBioPortal web application for visualization, analysis and download of large-scale cancer genomics data.
This will involve creating data processing, quality control (QC), integration and analysis pipelines so that data are processed and analyzed consistently. Similar to the TCGA, the Lung PCA will not only make all raw and processed data freely available for download and local analysis, but also provide integrative web-based analysis tools that enable analyses without local computational infrastructure or advanced computational biology expertise. We have adopted a scalable and modular approach that will allow the Atlas to grow and adapt in response to insights, methods and approaches from other Centers within the HTA Network and the broader clinical and research communities. It will also allow the Atlas framework and its components to be shared with other groups inside and outside the HTAN. The Data Analysis Unit will be co-led by Drs. Cerami, Getz, Leshchiner and Lenburg, who have an established track record of success in cancer informatics, bioinformatics, computational biology, biomarker discovery, and translational science, and considerable experience leading large-scale data analysis efforts in a number of large consortia (e.g., TCGA, ICGC, GTEx, and EDRN).

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Resource-Related Research Multi-Component Projects and Centers Cooperative Agreements (U2C)
Study Section
Special Emphasis Panel (ZRG1)
Boston University
United States