The investigator develops regression tree solutions for several important problems involving complex, high-dimensional data. The complexity arises from missing values, censoring, mixed variable types, and correlated measurements taken at random time points. The first problem is the identification of subgroups with differential treatment effects in comparative trials involving time-varying covariates. The second is importance scoring and thresholding of variables, and the third is detection of differential item functioning in testing and evaluation. The main approach adapts and extends the GUIDE decision tree algorithm to these problems. Expected challenges include minimizing error rates and computational cost, as well as ensuring unbiased selection of the variables used to split the nodes of the trees.

The ability to collect and generate greater amounts of data at faster speeds creates new difficulties for data analysis and interpretation. For example, the health industry is looking into using genetic information and repeated observations over time to find personalized treatments for diseases. The proposed research will extend a statistical approach based on decision trees to solve problems such as: (i) identifying subpopulations of patients who benefit more from one treatment than another, based on repeated observations of health and other outcomes over time, (ii) identifying and ranking genetic and other variables with respect to their importance in predicting illness and their interactions with treatments, and (iii) identifying test items in testing and evaluation that discriminate against people because of their gender, race, or socio-economic and cultural background. A decision tree model has the advantage that it is easy to apply and intuitive to interpret. The latter property is crucially important to understanding and advancing the science.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1305725
Program Officer: Gabor J. Szekely
Budget Start: 2013-08-15
Budget End: 2017-07-31
Fiscal Year: 2013
Total Cost: $130,000

Name: University of Wisconsin Madison
City: Madison
State: WI
Country: United States
Zip Code: 53715