This is a project to develop new statistical methods for comparing groups of subjects in terms of health outcomes that are assessed using data from wearable devices. Inexpensive wearable sensors for health monitoring are now capable of generating massive amounts of data collected longitudinally, up to months at a time. The project will develop inferential methods that can deal with the complexity of such data. A serious challenge is the presence of unmeasured time-dependent confounders (e.g., circadian and dietary patterns), making direct comparisons or borrowing strength across subjects untenable unless the studies are carried out in controlled experimental conditions. Generic data mining and machine learning tools have been widely used to provide predictions of health status from such data. However, such tools cannot be used for significance testing of covariate effects, which is necessary for designing precision medicine interventions, for example, without taking the inherent model selection or the presence of the unmeasured confounders into account. To overcome these difficulties, a systematic development of inferential methods for functional outcome data obtained from wearable devices will be carried out. There are three specific aims: 1) Develop metrics for functional outcome data from wearable devices, 2) Develop nonparametric estimation and testing methods for activity profiles and a screening method for predictors of activity profiles, 3) Implement the methods in an R package and carry out two case studies using accelerometer data.
For Aim 1, the approach is to reduce the sensor data to occupation time profiles (e.g., as a function of activity level), and formulate the statistical modeling in terms of these profiles using survival and functional data analytic methods. This will have a number of advantages, the principal one being that time-dependent confounders become less problematic because the effect of differences in temporal alignment across subjects is mitigated. In addition, survival analysis methods can be applied by viewing the occupation time as a time-to-event outcome indexed by activity level.
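The reduction described for Aim 1 can be illustrated with a minimal sketch. It assumes that the occupation time at activity level a is the total time a subject's recorded activity exceeds a, and that readings are equally spaced; the function name `occupation_time_profile`, the sampling interval `dt`, and the toy readings are hypothetical and are not taken from the project.

```python
import numpy as np

def occupation_time_profile(activity, levels, dt=1.0):
    """Time spent strictly above each activity level (assumed definition).

    activity: 1-D array of equally spaced sensor readings (hypothetical data)
    levels:   1-D array of activity thresholds
    dt:       time between consecutive readings (assumed constant)
    """
    activity = np.asarray(activity, dtype=float)
    levels = np.asarray(levels, dtype=float)
    # Compare every reading with every level: result has shape (n_levels, n_readings).
    above = activity[None, :] > levels[:, None]
    # Count readings above each level and convert the count to time units.
    return dt * above.sum(axis=1)

# Two hypothetical subjects with the same readings in a different temporal order.
subject_a = np.array([0.0, 1.0, 3.0, 2.0, 0.5, 4.0])
subject_b = subject_a[::-1]
levels = np.array([0.0, 1.0, 2.0, 3.0])

profile_a = occupation_time_profile(subject_a, levels)
profile_b = occupation_time_profile(subject_b, levels)
print(profile_a)
print(profile_b)
```

Under this assumed definition the profile is non-increasing in the activity level, much like a survival function, and it does not depend on when during the recording the activity occurred, which is why the two subjects above receive identical profiles.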
For Aim 2, nonparametric methods will be used to compare and order occupation time distributions between groups of subjects that are specified in terms of baseline covariate levels or treatment groups. Further, a new method of post-selection inference based on marginal screening for function-on-scalar regression will be developed to identify and formally test whether covariates are significantly associated with activity profiles.
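To give a sense of what a nonparametric comparison of activity profiles between two groups can look like, the sketch below uses a generic permutation test on the largest absolute difference between group mean profiles. This is a standard stand-in technique, not the estimation and testing method the project proposes; the function name, the simulated profiles, and all parameter values are hypothetical.

```python
import numpy as np

def permutation_test_profiles(profiles, labels, n_perm=999, seed=0):
    """Permutation test for a difference in mean profiles between two groups.

    profiles: array of shape (n_subjects, n_levels), one profile per subject
    labels:   boolean array, True for the first group, False for the second
    Returns the observed sup-norm difference and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    profiles = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels, dtype=bool)

    def sup_diff(lab):
        # Largest absolute gap between the two group mean profiles.
        gap = profiles[lab].mean(axis=0) - profiles[~lab].mean(axis=0)
        return np.max(np.abs(gap))

    observed = sup_diff(labels)
    # Shuffle group membership and count how often the statistic is at least as large.
    exceed = sum(sup_diff(rng.permutation(labels)) >= observed for _ in range(n_perm))
    return observed, (1 + exceed) / (n_perm + 1)

# Simulated (hypothetical) profiles: the second group spends 50% more time above every level.
levels = np.linspace(0.0, 4.0, 9)
base = 10.0 * np.exp(-levels)
noise = np.random.default_rng(1).normal(scale=0.1, size=(10, levels.size))
profiles = np.vstack([np.tile(base, (5, 1)), np.tile(1.5 * base, (5, 1))]) + noise
labels = np.array([True] * 5 + [False] * 5)

observed, p_value = permutation_test_profiles(profiles, labels)
print(observed, p_value)
```

A test of this kind compares two pre-specified groups only; it does not account for selecting covariates by screening, which is the problem the proposed post-selection inference method is meant to address.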
Aim 3 will develop an R package implementing the proposed methods. As a test bed, the methods will be applied to two Columbia-based clinical studies: a study of physical activity in children enrolled in New York City Head Start, and a study of experimental drugs for the treatment of mitochondrial depletion syndrome.

Public Health Relevance

The relevance of the project to public health is that it will develop statistical methods for the physiological evaluation of patients on the basis of data collected by inexpensive wearable sensors (e.g., accelerometers). By introducing methods for the rigorous comparison of healthcare status among groups of patients observed longitudinally over time using such devices, treatment decisions that can benefit targeted populations of patients in terms of continuously-assessed health outcomes will become possible.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Aging (NIA)
Type
Research Project (R01)
Project #
1R01AG062401-01A1
Application #
9658873
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Phillips, John
Project Start
2019-04-15
Project End
2024-03-31
Budget Start
2019-04-15
Budget End
2020-03-31
Support Year
1
Fiscal Year
2019
Total Cost
Indirect Cost
Name
Columbia University (N.Y.)
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
621889815
City
New York
State
NY
Country
United States
Zip Code
10032