This is a proposal to develop statistical methods that address the real-world complexities that commonly arise when mapping aggregated disease count data collected for administrative areas.
The specific aims are motivated by problems encountered in epidemiological studies designed to study and monitor health disparities, though they are also relevant to area-based studies of environmental effects. Our proposed methods address issues associated with administrative boundaries changing over time, sparse disease counts, spatial confounding, and the heavy computational burdens associated with the analysis of large data sets.
Specific aims of the project are to develop, evaluate, and implement (1) methods for handling boundary misalignment over time in disease mapping settings, (2) spatial regression models for area-specific disease count data exhibiting complex distribution patterns, (3) a theoretical framework and practical diagnostic strategies for assessing and minimizing bias from spatial confounding, (4) fast, memory-efficient algorithms for fitting standard spatio-temporal regression models, and (5) efficient, user-friendly algorithms and statistical software that implement these methods with the goal of disseminating them to health science researchers. The proposed methods will be applied to area-specific disease count data on U.S. breast cancer incidence, Boston-area premature mortality, Australian ischemic heart disease rates, and incidence and mortality data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database. The methods will allow researchers to better estimate how rates of cancer and other outcomes vary geographically and over time, thereby aiding in the documentation, analysis, and ultimate reduction of health disparities in the United States, which is one of the overarching goals of Healthy People 2010 (US Department of Health and Human Services 2000). This project (Project 1) integrates very closely with the spatial surveillance Project 2: whereas Project 1 focuses on spatio-temporal modeling for the purpose of characterizing the impact of area-based measures of socioeconomic status or other demographic characteristics on cancer and other diseases, Project 2 focuses on identifying areas where disease rates are unusually high. Analysis of SEER data features prominently in both Projects 1 and 2. Projects 1 and 3 share the common theme of analyzing high-dimensional observational data on cancer.
This project relies heavily on the Statistical Computing Core and will benefit from the organizational infrastructure, team-building strategies, short courses, and visitor program provided through the Administrative Core.