This competitive renewal will develop general methods for evaluating the comparative effectiveness of alternative strategies for HIV prevention, treatment and care in Southern and Eastern Africa. Large cluster randomized trials and global cohort collaborations generate longitudinal data on hundreds of thousands of patients in real world settings. These provide a tremendous resource for developing the practice-based evidence needed to maximize the impact of HIV prevention strategies and to improve healthcare delivery systems. Realizing this potential, however, demands innovations to the field of Targeted Learning for maximally unbiased and efficient estimation of statistical parameters, best approximating the causal effects of interest. First, improved methods for estimating the effects of patient responsive monitoring and treatment strategies must be developed. In the common settings of strong confounding and rare outcomes, current estimators suffer from bias, lack efficiency and have unreliable measures of uncertainty. Second, general causal models and identifiability assumptions must be developed for the joint effects of cluster and individual-level interventions over multiple time points. These models will account for interactions between individuals within clusters and potential contamination between clusters. This work will inform the optimal design for sampling clusters and measuring individuals within communities or clinics. Third, efficient and maximally unbiased estimators must be developed to evaluate the impact of cluster and individual-level interventions over multiple time points. Current methods are highly susceptible to bias and misleading inference due to model misspecification and due to the often incorrect assumption that the observed data represent n independent, identically distributed (i.i.d.) repetitions of an experiment. 
The developed methods will elucidate the pathways by which cluster-based interventions impact health, while remaining robust to the common challenges of sparsity, irregular and informative missingness, and few truly independent units (clusters) but potentially hundreds of thousands of conditionally independent units. These innovations are motivated by our collaborations with the International epidemiologic Databases to Evaluate AIDS (IeDEA) in Southern (PI Dr. Egger) and Eastern Africa (PI Dr. Yiannoutsos) and the Sustainable East Africa Research in Community Health (SEARCH) consortium (PI Dr. Havlir), a cluster randomized trial to evaluate the community-wide benefits of ART initiation at all CD4 counts. The developed methods will be applied to these data sources to investigate (i) strategies for monitoring antiretroviral therapy (ART) and guiding switches to second-line regimens, (ii) the direct and indirect effects of a community-based HIV prevention strategy and (iii) the impact of clinic-based programs for delivering HIV care. Finally, the resulting estimators will be implemented as publicly available software packages, and teaching papers will be written to explain the methodology in a clear and rigorous manner.

Public Health Relevance

There is a pressing need to evaluate the optimal strategies for delivering HIV prevention, treatment and care in resource-limited settings. We will advance HIV implementation science by formally evaluating which study designs best address our causal questions, developing statistical methodology to learn as much as possible from the complex data generated, implementing the resulting methodology in publicly available software, and applying it to two large clinical cohort consortiums in Southern and Eastern Africa and a large cluster randomized trial of antiretroviral-based HIV prevention in Kenya and Uganda.

National Institutes of Health (NIH)
National Institute of Allergy and Infectious Diseases (NIAID)
Research Project (R01)
Study Section
AIDS Clinical Studies and Epidemiology Study Section (ACE)
Program Officer
Gezmu, Misrak
University of California Berkeley
Biostatistics & Other Math Sci
Schools of Public Health
United States
Zheng, Wenjing; Luo, Zhehui; van der Laan, Mark J (2018) Marginal Structural Models with Counterfactual Effect Modifiers. Int J Biostat 14:
Balzer, Laura B; Zheng, Wenjing; van der Laan, Mark J et al. (2018) A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure. Stat Methods Med Res :962280218774936
Benkeser, David; Ju, Cheng; Lendle, Sam et al. (2018) Online cross-validation-based ensemble learning. Stat Med 37:249-260
Luedtke, Alexander R; van der Laan, Mark J (2018) Parametric-rate inference for one-sided differentiable parameters. J Am Stat Assoc 113:780-788
Koss, Catherine A; Ayieko, James; Mwangwa, Florence et al. (2018) Early Adopters of Human Immunodeficiency Virus Preexposure Prophylaxis in a Population-based Combination Prevention Study in Rural Kenya and Uganda. Clin Infect Dis 67:1853-1860
Zheng, Wenjing; Balzer, Laura; van der Laan, Mark et al. (2018) Constrained binary classification using ensemble learning: an application to cost-efficient targeted PrEP strategies. Stat Med 37:261-279
Kreif, Noémi; Tran, Linh; Grieve, Richard et al. (2017) Estimating the Comparative Effectiveness of Feeding Interventions in the Pediatric Intensive Care Unit: A Demonstration of Longitudinal Targeted Maximum Likelihood Estimation. Am J Epidemiol 186:1370-1379
Petersen, Maya; Balzer, Laura; Kwarisiima, Dalsone et al. (2017) Association of Implementation of a Universal Testing and Treatment Intervention With HIV Diagnosis, Receipt of Antiretroviral Therapy, and Viral Suppression in East Africa. JAMA 317:2196-2206
Rudolph, Kara E; van der Laan, Mark J (2017) Robust estimation of encouragement-design intervention effects transported across sites. J R Stat Soc Series B Stat Methodol 79:1509-1525
Ju, Cheng; Wyss, Richard; Franklin, Jessica M et al. (2017) Collaborative-controlled LASSO for constructing propensity score-based estimators in high-dimensional data. Stat Methods Med Res :962280217744588

Showing the most recent 10 out of 104 publications