Wildland fire smoke is a major contributor to air pollution in the United States (US) and is associated with a wide range of health risks. The number and intensity of wildland fires are expected to increase with a changing climate; therefore, there is a pressing need to accurately quantify the extent to which wildland fire smoke contributes to air pollution levels and the corresponding health burden, and to evaluate the effectiveness of preventative measures in mitigating that burden. However, this work presents many challenges. Exposure to wildland fires clearly cannot be randomized, so we rely on spatially correlated observational data and causal inference. While there is an impressive literature on causal inference for independent data, the methods available for spatial data are limited. Progress in the spatial setting has been slow due to complexities induced by spatial correlation and interference, i.e., treatment at one location can affect the response at nearby locations. We also analyze data from Smoke Sense, an Environmental Protection Agency (EPA)-sponsored citizen science project designed to engage citizens who experience the effects of fire smoke through a smartphone application (app). Citizen science studies have transformative potential to amass valuable data and engage the public in scientific research, but they can be plagued by self-selection of treatment and complex missing data patterns. The overarching theme of the proposal is to develop a suite of causal analysis tools to analyze observational spatial data and data arising from smartphone applications, handling interference, spatially varying treatment effects, informative missingness, and unmeasured spatial confounders.
In Aim 1, we provide a new formulation of spatial interference using kernel distance functions. We extend marginal structural models and structural nested mean models to the setting with spatial interference and propose doubly robust estimators of direct and indirect/spillover effects. We will apply this new method to estimate wildland fire smoke effects on air pollution levels and health burden. Because of subject heterogeneity in response to treatment, it is desirable to develop personalized recommendation strategies to determine which treatment works best, for whom, and under what circumstances.
In Aim 2, we propose a novel causal model that describes how treatment effects vary over space and with evolving subject characteristics. Using the Smoke Sense data, we will estimate heterogeneous effects of app engagement and preventative measures to mitigate the impact of wildland fire smoke. We also propose an instrumental variable approach to handling informative missingness, which arises frequently in studies with smartphone applications and can lead to invalid inference if not properly addressed.
In Aim 3, we build on our previous work to adjust for missing spatial confounders by modeling the relationship between the treatment and the missing confounders in the spectral domain and establishing conditions on their coherence that permit estimation of the treatment effect. The methods will be disseminated through freely available software and examined over a range of applications; the results of this project will therefore have a broad impact on future environmental health studies.
Causal inference theory and methods are indispensable to biostatistics, but due to complications such as spatial correlation and spillover effects, methods applicable to spatial data are limited. In this project, we develop new causal methods to allow researchers to extract knowledge from observational spatial data. The methods are developed in the context of two large epidemiological studies of the effects of wildland fire smoke and preventative measures on air pollution exposure and respiratory health outcomes.