Improved statistical methods that provide better inferences and new analytical capabilities for categorical regression models would be invaluable to the medical and health-related research communities. Presently, single regression models are used extensively to identify patterns of disease-related symptoms, screen for disorders, analyze the results of clinical trials, and assess and justify public health policies. However, although single-model estimation and inference are widely used in health-related studies, such approaches neglect model uncertainty and thus forfeit the opportunity to: i) detect additional statistical regularities (e.g., treatment effects, risk factors), ii) improve the precision of statistical inferences for estimation and prediction/classification (e.g., patient screening, diagnosis), iii) control for overfitting (e.g., model selection bias), and iv) include different yet highly correlated risk factors. This Phase I study investigates the feasibility of combining robust estimators and specification analysis methods within a multimodel framework to create a robust multimodel estimation and inference technology that addresses the limitations of the single-model approach. Robust multimodeling is a specific type of Frequentist Model Averaging (FMA) methodology. First, an important feature of this approach is that it provides robust confidence intervals on predictions and effect sizes averaged across multiple models; these intervals simultaneously incorporate the uncertainty that arises from the presence of many different (yet equally appropriate) models of the same data generating process and the uncertainty that results from sampling error. Second, our robust multimodeling approach has a robust Bayesian Model Averaging (BMA) interpretation: theoretical arguments establish that all inferences are robust with respect to the presence of model misspecification.
Third, previous work in the BMA and FMA literature has tended to focus on the "most probable" models within a model space, applying Occam's Window to identify a group of best models rather than using all possible models in computationally tractable model spaces. In this Phase I study, alternative strategies for multimodel estimation and inference involving large model spaces will be studied empirically through extensive simulations using realistic models on clinical trial datasets (NIDA-CTN, NIMH-STAR*D). Finally, Phase I feasibility results will provide the preliminary research and design for the Phase II prototype software and support technology dissemination through collaborative health-related research projects, establishing the essential foundation for Phase III product commercialization.
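
To make the model averaging and Occam's Window ideas in the abstract concrete, here is a minimal Python sketch. It is not the project's robust multimodeling method: it assumes plain AIC-based (Akaike) weights and a commonly used weighted standard error that adds between-model spread to each model's sampling variance. The function names, the window width of 6 AIC units, and the numbers in the usage example are all hypothetical and chosen only for illustration.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into normalized model weights (assumed weighting scheme)."""
    best = min(aic_values)
    # Subtract the best AIC first so the exponentials stay numerically stable.
    raw = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def occams_window(aic_values, max_delta=6.0):
    """Return indices of models whose AIC lies within max_delta of the best model.

    max_delta is a hypothetical cutoff, not a value taken from the abstract.
    """
    best = min(aic_values)
    return [i for i, a in enumerate(aic_values) if a - best <= max_delta]


def model_average(estimates, std_errors, weights):
    """Weighted average of per-model estimates plus a standard error that
    combines sampling error with the disagreement between models."""
    avg = sum(w * e for w, e in zip(weights, estimates))
    se = sum(
        w * math.sqrt(s ** 2 + (e - avg) ** 2)
        for w, e, s in zip(weights, estimates, std_errors)
    )
    return avg, se


if __name__ == "__main__":
    # Hypothetical AIC values and treatment-effect estimates (log-odds)
    # from four candidate logistic regression models.
    aic = [100.0, 101.2, 104.5, 115.0]
    effect = [0.42, 0.35, 0.50, 0.10]
    se = [0.10, 0.12, 0.11, 0.15]

    keep = occams_window(aic)                      # models inside the window
    w = akaike_weights([aic[i] for i in keep])     # weights renormalized over kept models
    avg, avg_se = model_average(
        [effect[i] for i in keep], [se[i] for i in keep], w
    )
    print(keep, [round(x, 3) for x in w], round(avg, 3), round(avg_se, 3))
```

In this sketch the window discards the poorly supported fourth model before averaging. Averaging over the full model space instead, the alternative the abstract proposes to study, would amount to skipping the `occams_window` step.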

Public Health Relevance

Improved statistical methods that provide better inferences from categorical regression models in clinical trials and observational studies would be invaluable to clinical science, epidemiology, and health services research. This Phase I study builds on prior work in statistical theory, software development, and applied research to develop a multimodel paradigm that makes robust inferences in the simultaneous presence of model uncertainty, possible model misspecification, and multicollinearity. This Phase I feasibility study will: i) demonstrate the applicability of the new technology on NIH-sponsored clinical trial datasets (NIDA-CTN, NIMH-STAR*D), ii) provide the preliminary research and design for Phase II prototype software, iii) support technology dissemination through collaborative health-related research projects, and iv) establish the initial foundation for Phase III product commercialization.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43GM106465-01A1
Application #
8592200
Study Section
Special Emphasis Panel (ZRG1-BBBP-V (11))
Program Officer
Sheeley, Douglas
Project Start
2013-09-20
Project End
2015-08-31
Budget Start
2013-09-20
Budget End
2014-08-31
Support Year
1
Fiscal Year
2013
Total Cost
$289,453
Indirect Cost
Name
Martingale Research Corporation
Department
Type
DUNS #
174995134
City
Plano
State
TX
Country
United States
Zip Code
75074