Statisticians play a key role in quantifying the uncertainty in findings from clinical trials, observational studies, and other data sources, thereby enabling rational decision making based on the findings and protecting against the human tendency to see signal in noise. Ideally the validity of the uncertainty quantification, also known as statistical inference, will be agnostic in the sense that it avoids reliance on implausible assumptions, thereby improving the interpretability and credibility of the resulting findings. There has been much progress over the last several decades in obtaining agnostic inference for population-level quantities, such as (i) the percent reduction in infection in vaccinees compared to placebo recipients or (ii) the average treatment effect if the entire population receives treatment versus control. There has been relatively little progress, on the other hand, in obtaining agnostic inference for higher-resolution quantities, such as (i) the percent reduction in the probability of infection in vaccinees compared to placebo recipients, conditional on a continuous measure of immune response or (ii) the average treatment effect for an individual based on a high-dimensional set of observed covariates. These high-resolution quantities are function-valued in the sense that they are respectively defined as functions of immune response and subject covariates. While inference for high-resolution quantities can be obtained when one is willing to commit to strong assumptions on the observed data distribution, these methods are plagued by the same deficiencies of poor interpretability and damaged credibility faced by non-agnostic inferential procedures for population-level quantities.
In contrast to the limited progress on obtaining inference for more refined quantities, there has been considerable progress towards obtaining inference-free point estimates of (i) the percent reduction in infection probability conditional on immune response markers and (ii) the average treatment effect conditional on covariates. This progress has come from several fields, including statistics, machine learning, and computer science. This proposal outlines a unified methodology for obtaining inference for high-resolution quantities. The methodology draws inspiration from a framework developed for population-level quantities, namely targeted minimum loss-based estimation, but features highly original developments that enable inference for higher-resolution, function-valued quantities. This work has the potential to make a major impact on the identification of individual-level variables that correlate with vaccine efficacy, including in identifying individuals who will be harmed by the only licensed dengue vaccine and in identifying vaccine-induced immune responses that correlate with prevention in two ongoing HIV vaccine efficacy trials. The potential for broad impacts on other areas of biomedical research, including precision medicine, is also described.
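
The distinction between population-level and function-valued quantities can be written out in notation. The following is only an illustrative sketch: the symbols are assumptions introduced here and do not appear in the abstract. Let A denote the randomized arm (1 for vaccine or treatment, 0 for placebo or control), Y the outcome, Y(a) the potential outcome under arm a, S a continuous immune response marker, and X the covariates. The conditional risks r_1(s) and r_0(s) are placeholders, since the abstract does not say how the infection probability at marker level s is defined in each arm.

```latex
\begin{align*}
  % Population-level quantities: each is a single number
  \mathrm{VE}  &= 1 - \frac{P(Y = 1 \mid A = 1)}{P(Y = 1 \mid A = 0)},
  &
  \mathrm{ATE} &= E\bigl[\,Y(1) - Y(0)\,\bigr], \\[4pt]
  % Higher-resolution quantities: each is a function of s or x
  \mathrm{VE}(s)   &= 1 - \frac{r_1(s)}{r_0(s)},
  &
  \mathrm{CATE}(x) &= E\bigl[\,Y(1) - Y(0) \mid X = x\,\bigr].
\end{align*}
```

Under this notation, VE and ATE are scalars, whereas VE(s) and CATE(x) are entire functions, which is the sense in which the abstract calls the higher-resolution quantities function-valued.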

Public Health Relevance

The statistics and machine learning communities have made remarkable progress in developing highly flexible data mining procedures that, for example, enable (i) the identification of potentially protective vaccine-induced immune responses and (ii) the prediction of the optimal treatment for a given individual. Despite these advances, there has been little progress in quantifying the uncertainty in the output of these flexible procedures without making implausible assumptions on the mechanism that generated the data. The likely failure of these assumptions makes it difficult to make (i) informed vaccine development decisions based on the identified immune response or (ii) treatment decisions based on the predicted optimal treatment. The proposed work develops a unified methodology for quantifying the uncertainty in estimates obtained using these data mining techniques, thereby enabling informed decision making.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: NIH Director’s New Innovator Awards (DP2)
Project #: 1DP2LM013340-01
Application #: 9774603
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Ye, Jane
Project Start: 2019-08-20
Project End: 2024-05-31
Budget Start: 2019-08-20
Budget End: 2024-05-31
Support Year: 1
Fiscal Year: 2019
Total Cost:
Indirect Cost:

Name: University of Washington
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195