Sample selection is a pernicious source of potential bias that plagues randomized and observational studies alike in the health sciences. Selection bias is said to be present in a study if, in the observed sample, features of the underlying population of primary scientific interest are entangled with features of the selection process that are not of scientific interest, so that naive inferences may be inaccurate and possibly misleading. The proposal aims to study two leading causes of selection bias: (i) outcomes missing not at random in regression analysis, and (ii) outcomes unobserved due to truncation by death. The main goal is to clarify the distinguishing features of (i) and (ii) and to develop novel methodology to tame selection bias in each of these settings. The methods for (i) will be used to make inferences about HIV sero-prevalence in Botswana based on a nationally representative household survey subject to substantial (>40%) refusal of HIV testing by household members. The methods for (ii) will be used to obtain inferences about the effects of maternal HIV status on outcomes typically observed only for live births, such as low birth weight, in the presence of non-trivial rates of stillbirth in a study conducted in Botswana.

Public Health Relevance

Sample selection is a potential threat to the validity of randomized and observational studies in the health sciences. Selection bias can arise when an outcome is missing not at random, sometimes because of death, in which case valid inference often cannot be obtained without an additional assumption. We propose instrumental variable type techniques to account for selection bias due to certain extreme forms of missing data often encountered in the health sciences, with an emphasis on HIV research.

National Institutes of Health (NIH)
National Institute of Allergy and Infectious Diseases (NIAID)
Exploratory/Developmental Grants (R21)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Gezmu, Misrak
Harvard University
Public Health & Prev Medicine
Schools of Public Health
United States
Prague, Melanie; Wang, Rui; Stephens, Alisa et al. (2016) Accounting for interactions and complex inter-subject dependency in estimating treatment effect in cluster-randomized trials with missing outcomes. Biometrics 72:1066-1077
Nguyen, Thu T; Tchetgen Tchetgen, Eric J; Kawachi, Ichiro et al. (2016) Instrumental variable approaches to identifying the causal effect of educational attainment on dementia risk. Ann Epidemiol 26:71-6.e1-3
Nguyen, Thu T; Tchetgen Tchetgen, Eric J; Kawachi, Ichiro et al. (2016) Comparing Alternative Effect Decomposition Methods: The Role of Literacy in Mediating Educational Effects on Mortality. Epidemiology 27:670-6
Mayeda, Elizabeth Rose; Tchetgen Tchetgen, Eric J; Power, Melinda C et al. (2016) A Simulation Platform for Quantifying Survival Bias: An Application to Research on Determinants of Cognitive Decline. Am J Epidemiol 184:378-87
Tchetgen Tchetgen, Eric J; Phiri, Kelesitse; Shapiro, Roger (2015) A Simple Regression-based Approach to Account for Survival Bias in Birth Outcomes Research. Epidemiology 26:473-80
Naimi, Ashley I; Tchetgen Tchetgen, Eric J (2015) Invited commentary: Estimating population impact in the presence of competing events. Am J Epidemiol 181:571-4
VanderWeele, Tyler J; Tchetgen Tchetgen, Eric J; Halloran, M Elizabeth (2014) Interference and Sensitivity Analysis. Stat Sci 29:687-706
Tchetgen Tchetgen, Eric J (2014) Identification and estimation of survivor average causal effects. Stat Med 33:3601-28