The project consists of two parts, both motivated by and applicable to educational assessment and the health sciences. Advances in computer and information technology enable educational assessments to measure comprehensive problem-solving skills in virtual environments in which examinees interact with computers. Existing evaluation methods look only at the final answers, ignoring the vast behavioral data collected over the course of the interaction. The first part of the research examines individuals' entire interactive problem-solving processes so that comprehensive problem-solving skills can be assessed more efficiently and more accurately. The new tools will have a direct impact on the design and analysis of large-scale national and international educational assessments such as the National Assessment of Educational Progress (NAEP) and the Programme for International Student Assessment (PISA), which are the two most important assessment schemes in primary and secondary education. The second part develops novel statistical approaches to analyzing large-scale health system data. These developments could be used to ascertain the efficacy and monitor the side effects of drugs currently used in healthcare management programs. They could also lead to new statistical tools for analyzing behavioral data, which are common in social science studies. The project provides research training opportunities for graduate students.

The research develops latent variable models for moderately high-dimensional counting process data and dynamic regression models for counting process data in which both covariates and events are sparse. For the latent variable (factor) models, the research addresses the fundamental and challenging issue of identifiability by finding suitable constraints, which also lead to more parsimonious and interpretable models. Valid inferential methods are developed by establishing the necessary asymptotic results under appropriate regularity conditions. Stochastic gradient-based algorithms are constructed to carry out parameter estimation efficiently. For the multidimensional counting process models with frailty and dynamic covariates, the research addresses the challenging issue of sparsity in both events and covariates. By exploiting special structures inherent in such data, the research establishes suitably normalized asymptotic theory for parameter estimation so that valid inference can be conducted. Covariate sparsity and correlated frailty make the asymptotic theory challenging, because standard techniques for counting process models are no longer appropriate.
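
As a rough illustration of how identifiability constraints and stochastic gradient estimation fit together, the sketch below fits a deliberately simplified one-factor Poisson model to event counts. It is not the project's model or algorithm: the model form (counts over a fixed window with log-rate a[j] + b[j] * theta[i]), the constraints (latent scores with mean 0 and variance 1, first loading positive), the gradient clipping, and all function and parameter names are assumptions made only for this example.

```python
import math
import random


def poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(n_people=200, n_types=6, seed=0):
    """Hypothetical data: counts[i][j] ~ Poisson(exp(a[j] + b[j] * theta[i]))."""
    rng = random.Random(seed)
    a = [rng.uniform(0.5, 1.5) for _ in range(n_types)]
    b = [rng.uniform(0.3, 0.8) for _ in range(n_types)]
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    return [[poisson(math.exp(a[j] + b[j] * t), rng) for j in range(n_types)]
            for t in theta]


def log_lik(counts, a, b, theta):
    """Poisson log-likelihood, dropping the term that is free of parameters."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            eta = a[j] + b[j] * theta[i]
            total += n * eta - math.exp(eta)
    return total


def normalize(a, b, theta):
    """Impose mean(theta) = 0, var(theta) = 1 and b[0] > 0.

    Every log-rate a[j] + b[j] * theta[i] is left unchanged, so this only
    picks one representative among equivalent parameter values.
    """
    mean = sum(theta) / len(theta)
    sd = math.sqrt(sum((t - mean) ** 2 for t in theta) / len(theta))
    sign = -1.0 if b[0] < 0 else 1.0
    for j in range(len(b)):
        a[j] += b[j] * mean
        b[j] *= sd * sign
    for i in range(len(theta)):
        theta[i] = sign * (theta[i] - mean) / sd


def fit(counts, epochs=30, lr=0.005, seed=1):
    """Stochastic gradient ascent, one (person, event type) cell per step."""
    rng = random.Random(seed)
    n_people, n_types = len(counts), len(counts[0])
    a = [0.0] * n_types
    b = [0.1] * n_types
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    normalize(a, b, theta)
    start = log_lik(counts, a, b, theta)
    cells = [(i, j) for i in range(n_people) for j in range(n_types)]
    for _ in range(epochs):
        rng.shuffle(cells)
        for i, j in cells:
            # Gradient of the cell's log-likelihood with respect to its log-rate.
            resid = counts[i][j] - math.exp(a[j] + b[j] * theta[i])
            # Clipping is a numerical guard added for this sketch only.
            resid = max(-10.0, min(10.0, resid))
            b_old = b[j]
            a[j] += lr * resid
            b[j] += lr * resid * theta[i]
            theta[i] += lr * resid * b_old
        normalize(a, b, theta)  # re-impose the constraints after each pass
    return a, b, theta, start, log_lik(counts, a, b, theta)


if __name__ == "__main__":
    a, b, theta, start, end = fit(simulate())
    print("log-likelihood before and after fitting:", start, end)
```

Without the normalization step, shifting or rescaling the latent scores can be absorbed into the intercepts and loadings without changing the likelihood, which is the kind of indeterminacy that identifiability constraints are meant to remove.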

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 2015417
Program Officer: Huixia Wang
Budget Start: 2020-07-01
Budget End: 2023-06-30
Fiscal Year: 2020
Total Cost: $200,000
Name: Columbia University
City: New York
State: NY
Country: United States
Zip Code: 10027