The places people reside throughout their lives play an important role in their health and in their propensity to develop diseases such as cancer. However, the longitudinal spatiotemporal contexts of where people live are not commonly incorporated into cancer studies. Recent advances in information technology and "big data" and associated analytic approaches have made it possible for cancer registries and researchers to capture residential histories at the population level. We propose to develop a large multi-dimensional database for cancer patients using multiple data sources to reconstruct their longitudinal residential and exposure histories, and to identify potential patient exposure profiles using data mining techniques guided by scientific evidence from the cancer epidemiology and environmental health literature. We will demonstrate the feasibility and identify advantages and challenges of such an approach by using mesothelioma as an example. We hypothesize that there are distinct spatiotemporal environmental exposure trajectories and exposure profiles among mesothelioma patients that can be identified using residential histories.
Our specific aims are:
Aim 1: Develop an optimal algorithm to streamline the process of compiling, cleaning, verifying, and constructing the residential histories of mesothelioma patients diagnosed between 2011 and 2015 in New York, as reported to the New York State Cancer Registry (NYSCR), utilizing multiple commercial and governmental data sources.
Aim 2: Develop an optimal algorithm to streamline the process of compiling, cleaning, verifying, and constructing the exposure history associated with each mesothelioma patient's residential history by leveraging exposure proxies at the individual residence level and area-level information associated with patients' residential addresses, utilizing multiple commercial and governmental data sources.
Aim 3: Visualize the spatiotemporal dynamics of patients' residential and exposure histories, and identify predictors of their exposure profiles, using advanced data mining techniques such as cluster analysis, latent class analysis, and network analysis.
The proposal is innovative in both the methods for constructing the database and the analytical methods for uncovering important exposure profiles, such as critical exposure windows, environmental clusters/hotspots, and the relative contributions of exposures across space and time. To our knowledge, no similar database exists at present. The residential data compiled in this project will be permanently stored within the NYSCR to allow future use, the first such example by any cancer registry. The identified exposure phenotypes will contribute to a better understanding of the role environmental exposure plays in mesothelioma disease development. The methods developed can be tested, scaled up, replicated by other states, and adapted to other cancers and non-cancer related conditions. This life-course perspective approach holds great potential for advancing cancer research as well as for routine cancer registry surveillance.
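As a minimal illustration of the kind of consolidation step Aim 1 involves, the sketch below merges address records from several sources into a single residential timeline per patient. It is not the project's algorithm: the `AddressRecord` type, its fields, the `build_residential_history` function, and the source labels are all hypothetical, and addresses are assumed to be standardized already so that identical strings refer to the same residence.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AddressRecord:
    """One report placing a patient at an address (hypothetical schema)."""
    address: str  # assumed to be already standardized
    start: date   # first date the source places the patient here
    end: date     # last date the source places the patient here
    source: str   # hypothetical label, e.g. "commercial" or "government"


def build_residential_history(records):
    """Merge overlapping records for the same address into residence episodes.

    Returns a list of (address, start, end, sources) tuples ordered by start date.
    """
    # Group the records by address.
    by_address = {}
    for rec in records:
        by_address.setdefault(rec.address, []).append(rec)

    episodes = []
    for address, recs in by_address.items():
        recs.sort(key=lambda r: r.start)
        cur_start, cur_end = recs[0].start, recs[0].end
        sources = {recs[0].source}
        for rec in recs[1:]:
            if rec.start <= cur_end:
                # Overlaps the current episode: extend it and note the source.
                cur_end = max(cur_end, rec.end)
                sources.add(rec.source)
            else:
                # Gap at this address: close the episode and start a new one.
                episodes.append((address, cur_start, cur_end, sorted(sources)))
                cur_start, cur_end = rec.start, rec.end
                sources = {rec.source}
        episodes.append((address, cur_start, cur_end, sorted(sources)))

    episodes.sort(key=lambda e: e[1])
    return episodes
```

Recording which sources support each episode is one simple way to carry verification information forward; an actual pipeline would also need address standardization, geocoding, and rules for resolving conflicting dates, none of which are shown here.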

Public Health Relevance

The places people reside throughout their lives play an important role in their health and in their propensity to develop diseases such as cancer. We will develop a large multi-dimensional database for cancer patients using multiple data sources to reconstruct their longitudinal residential and exposure histories, and to identify potential patient exposure profiles using data mining techniques guided by scientific evidence from the cancer epidemiology and environmental health literature. The life-course perspective approach developed from this exploratory project can be scaled up and adopted for other cancers, and has wide applications in cancer research and cancer registry surveillance.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21CA235153-01
Application #
9650773
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Tatalovich, Zaria
Project Start
2019-09-20
Project End
2021-08-31
Budget Start
2019-09-20
Budget End
2020-08-31
Support Year
1
Fiscal Year
2019
Total Cost
Indirect Cost
Name
Icahn School of Medicine at Mount Sinai
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
078861598
City
New York
State
NY
Country
United States
Zip Code
10029