Sample selection is a pernicious source of potential bias known to plague randomized and observational studies alike in the health sciences. Selection bias is said to be present in a study if, in the observed sample, features of the underlying population that are of primary scientific interest are entangled with features of the selection process that are not of scientific interest, so that naive inferences may be inaccurate and possibly misleading. The proposal aims to study two leading causes of selection bias: (i) outcomes missing not at random in regression analysis, and (ii) outcomes unobserved due to truncation by death. The main goal is to clarify the distinguishing features of (i) and (ii) and to develop novel methodology to tame selection bias in each of these settings. The methods for (i) will be used to make inferences about HIV seroprevalence in Botswana based on a nationally representative household survey subject to substantial (>40%) refusal of HIV testing by household members. The methods for (ii) will be used to draw inferences about the effects of maternal HIV status on outcomes typically observed only for live births, such as low birth weight, in the presence of non-trivial rates of stillbirth in a study conducted in Botswana.

Public Health Relevance

Sample selection is a potential threat to the validity of randomized and observational studies in the health sciences. Selection bias can arise when an outcome is missing not at random, sometimes because of death, in which case valid inference often cannot be obtained without an additional assumption. We propose instrumental-variable-type techniques to account for selection bias due to certain extreme forms of missing data often encountered in the health sciences, with an emphasis on HIV research.

National Institutes of Health (NIH)
Exploratory/Developmental Grants (R21)
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Gezmu, Misrak
Harvard University
Public Health & Prev Medicine
Schools of Public Health
United States