In epidemiology, clinical research, and the social sciences, inferences about the causal effects of treatments and risk factors are used to design more effective interventions. This project focuses on the development of statistical methods for causal inference. The first part of this project will develop a causal inference method for mediation analysis. If a treatment has a beneficial effect on an outcome, it is often of interest to investigate the pathways by which it affects the outcome. Direct and indirect effects decompose the effect of a treatment into the part that is mediated by a covariate (the mediator) and the part that is not. For example, in HIV/AIDS research, it is important to estimate how much of the effect of Antiretroviral Therapy (ART) on mother-to-child transmission of HIV is mediated by the effect of ART on the HIV viral load in the mother's blood. In medicine, psychology, political science, and economics, differentiating between indirect and direct effects has become increasingly important. Therefore, it is paramount that appropriate statistical methods are developed to estimate direct and indirect effects in a variety of settings, including the setting in which there are post-treatment common causes of the mediator and the outcome. The second part of this project will compare confidence regions. Recently, there has been extensive discussion in the statistical community about a move away from p-values. P-values can lead researchers to conclude that a treatment has a statistically significant effect even if that effect is very small and clinically irrelevant. Confidence regions are the obvious alternative to p-values, as they provide a range of values of the parameters of interest that are most consistent with the data. While comparisons of p-values have been extensively researched and confidence regions are routinely reported, comparison of confidence regions has received relatively little attention. In this project, confidence regions will be compared based on the notion of asymptotic equivalence.
Natural direct and indirect effects use cross-worlds counterfactuals: outcomes under treatment with the mediator "set" to the value it would have taken without treatment. Cross-worlds counterfactuals can never be observed, as they involve quantities under two different treatments, whereas only one treatment is given to any particular patient or unit. The PI has recently proposed organic direct and indirect effects to avoid the use of cross-worlds counterfactuals. Organic direct and indirect effects also apply when the mediator cannot be "set". For example, the HIV viral load in the mother's blood cannot be set; if it could be set, doctors would set it to zero. In the first part of this project, organic direct and indirect effects will be extended to settings with post-treatment common causes of the mediator and the outcome. It will be shown that, in contrast to natural direct and indirect effects, estimators and confidence intervals can be developed in that setting for organic effects. The second part of this project will compare confidence regions. Most work on the comparison of confidence regions has studied coverage probabilities, confidence interval length, and small-sample properties. In this project, confidence regions will be compared for large samples, based on the asymptotic behavior of the Hausdorff distance between the different confidence regions. The Hausdorff distance between two partly overlapping intervals is simply the larger of two quantities: the absolute difference between their left limits and the absolute difference between their right limits. The Hausdorff distance has also been defined for non-convex sets and in higher dimensions.
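As an illustration of the interval case described above, the following minimal Python sketch computes the Hausdorff distance between two closed, bounded intervals from their endpoints and checks it against the general max-min definition applied to finite grids of points. The function names `hausdorff_intervals` and `hausdorff_finite` and the example intervals are hypothetical choices made for this illustration; they are not taken from the project itself.

```python
def hausdorff_intervals(a, b):
    """Hausdorff distance between closed bounded intervals a = (lo, hi) and b = (lo, hi).

    Uses the endpoint formula: the larger of the absolute difference between
    the left limits and the absolute difference between the right limits.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("each interval must satisfy lo <= hi")
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))


def hausdorff_finite(xs, ys):
    """Hausdorff distance between two finite, non-empty sets of real numbers.

    Implements the general definition: the larger of the two directed
    distances, where each directed distance is the greatest distance from a
    point of one set to the nearest point of the other set.
    """
    def directed(ps, qs):
        return max(min(abs(p - q) for q in qs) for p in ps)

    return max(directed(xs, ys), directed(ys, xs))


# Illustrative intervals (assumed values, e.g. two confidence intervals
# for the same parameter obtained by different methods).
interval_a = (0.0, 2.0)
interval_b = (1.0, 4.0)

# Grids with step 0.25 that include the endpoints of each interval.
grid_a = [i * 0.25 for i in range(0, 9)]    # 0.0, 0.25, ..., 2.0
grid_b = [i * 0.25 for i in range(4, 17)]   # 1.0, 1.25, ..., 4.0

print(hausdorff_intervals(interval_a, interval_b))  # 2.0
print(hausdorff_finite(grid_a, grid_b))             # 2.0
```

Here the left limits differ by 1 and the right limits differ by 2, so the distance is 2. The grid-based check agrees because the points farthest from the other set are the endpoints, which both grids contain.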
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.