A patient-reported outcome (PRO) is a report of a patient's health status that comes directly from the patient, without interpretation by a clinician or anyone else. Because PROs describe health status from the patient's own viewpoint, PRO research holds great promise for informing clinical and policy decision-making and for improving the quality and efficiency of healthcare. However, the quality and value of PRO data depend on a number of factors, one of which is whether the data are missing not at random (MNAR). For instance, patients might fail to complete a depression survey because of their level of depression, or sicker patients may be less likely to complete a quality-of-life questionnaire. In such cases the PROs are missing because of the patient's declining health, but the extent of the decline is unknown because it is not observed; the missingness is therefore informative, and the data are MNAR. Similar situations arise in large-scale health surveys and electronic health record databases. In this project, the PIs will study statistical methodology and computational algorithms for the MNAR problem in PROs and in similar settings. The resulting methods have the potential to be applied in a range of studies, including research on Alzheimer's disease, mental health disorders, orthopedics, and pain. The PIs will also engage in education at both disciplinary and interdisciplinary levels, with beneficiaries ranging from local high school students and undergraduates to master's and PhD students and biomedical investigators. The project will also provide research opportunities for postdoctoral scholars.
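
To make the MNAR mechanism described above concrete, the following is a minimal simulation sketch, not part of the project itself. It assumes hypothetical quality-of-life scores drawn from a normal distribution and a hypothetical logistic response model in which sicker patients (lower scores) respond less often; the function name `simulate_mnar` and all numeric settings are illustrative choices.

```python
import math
import random


def simulate_mnar(n=20000, seed=0):
    """Simulate scores whose chance of being reported depends on the score itself."""
    rng = random.Random(seed)
    # Hypothetical quality-of-life scores: mean 50, standard deviation 10.
    scores = [rng.gauss(50.0, 10.0) for _ in range(n)]

    observed = []
    for y in scores:
        # Assumed response model: the probability of responding increases with
        # the score, so low (sicker) scores are more likely to be missing.
        # Because this probability depends on y itself, the data are MNAR.
        p_respond = 1.0 / (1.0 + math.exp(-(y - 50.0) / 5.0))
        if rng.random() < p_respond:
            observed.append(y)

    true_mean = sum(scores) / n
    complete_case_mean = sum(observed) / len(observed)
    response_rate = len(observed) / n
    return true_mean, complete_case_mean, response_rate


true_mean, cc_mean, response_rate = simulate_mnar()
print(f"mean of all scores:        {true_mean:.2f}")
print(f"mean of reported scores:   {cc_mean:.2f}")
print(f"fraction of scores reported: {response_rate:.2f}")
```

In this sketch the mean of the reported scores overstates the mean of all scores, because the patients who do not respond are disproportionately those with low scores. An analysis that simply drops the missing responses inherits that bias.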

The overarching goal of this project is to establish a groundbreaking and translational statistical methodology framework, comprising robust methods and efficient estimators, that imposes minimal assumptions on the missing data mechanism so that the developed methods can be applied as flexibly as possible. Motivated by the well-recognized fact that there is no adequate way to test whether a missing data mechanism is correctly specified, the PIs will adopt the shadow variable approach to achieve model identification while making essentially no further assumptions on the mechanism, thereby providing the greatest possible protection against model misspecification. The methodology achieves robustness to misspecification of the mechanism model by leveraging the model-based likelihood and its associated semiparametric structure. The statistical methods developed in this project will be implemented in efficient R packages with user-friendly interfaces for researchers whose primary goal is the analysis of missing data, especially MNAR data.
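
The abstract does not define the shadow variable it refers to. As a sketch of how the condition is commonly stated in the missing-data literature, a fully observed variable Z is called a shadow variable when it satisfies the two conditions below. The symbols Y, X, R, and Z are introduced here for illustration and do not appear in the abstract, and whether the project uses exactly this formulation is an assumption.

```latex
% Y: outcome subject to missingness
% R: response indicator (R = 1 if Y is observed)
% X: fully observed covariates
% Z: candidate shadow variable (fully observed)
\begin{align*}
  &\text{(i)}\quad  Z \perp\!\!\!\perp R \mid (Y, X), \\
  &\text{(ii)}\quad Z \not\perp\!\!\!\perp Y \mid (R = 1, X).
\end{align*}
```

Informally, Z is associated with the outcome, but once Y and X are accounted for it carries no further information about whether the outcome is reported.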

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1953526
Program Officer: Pedro Embid
Project Start:
Project End:
Budget Start: 2020-08-01
Budget End: 2021-03-31
Support Year:
Fiscal Year: 2019
Total Cost: $366,166
Indirect Cost:
Name: SUNY at Buffalo
Department:
Type:
DUNS #:
City: Buffalo
State: NY
Country: United States
Zip Code: 14228