Improved statistical methods that provide better classification performance and new analytical capabilities for categorical regression would be invaluable to the medical and health care research communities. Categorical regression models (e.g., binary logistic, multinomial logistic) are used extensively to identify patterns of alcohol-related symptoms, screen for disorders, and assess policies, and they are equally common in other areas of research such as mental illness, cancer, traumatic injuries, and AIDS-related pathologies. However, many such models are developed without adequate support for fully analyzing and exploiting the intrinsically probabilistic nature of their results. This is of critical importance because health researchers, clinicians, and administrators often face classification decisions that rely on categorical regression models to identify unacceptable risks, adequate outcomes, and acceptable guidelines for screening, diagnosis, treatment, and quality of care. Commercially available statistical software does not offer sophisticated methods for robust estimation of posterior probabilities in the presence of model misspecification, missing covariates, and nonignorable missing data generating processes. Such robust missing data handling methods provide natural mechanisms for dealing with verification bias and for modeling correlated, longitudinal, or survey data with complex sampling designs. Moreover, commercially available statistical software does not provide automated methods for using estimated posterior probabilities to make optimal classification decisions with respect to different optimality criteria.
In particular, automated features are not readily available for optimizing multiple decision criteria (allocation rules) that trade off specificity against sensitivity, computing decision threshold confidence intervals, testing whether posterior probabilities are correctly specified, comparing competing classifier thresholds, and performing multi-outcome classification and inference. Phase II research will extend Phase I findings for binary logistic regression to develop and implement automated robust classification methods for multinomial logistic regression modeling; these methods also apply to the larger class of nonlinear categorical regression models that output posterior probabilities. The Phase II software prototype will provide: 1) new user-selectable robust decision threshold estimators, 2) robust confidence intervals on decision threshold estimators, 3) new classifier threshold comparison tests, 4) new outcome probability specification tests, 5) efficient missing data handling methods in the presence of nonignorable nonresponse data, and 6) second-order analytic and simulation-based Bayesian methods for improved small sample and rare event outcome probability estimation. These new methodologies will be integrated into a prototype user-friendly software package, evaluated with extensive simulation studies, and then applied to real-world classification problems encountered in alcohol research, mental illness (depression, bipolar disorder, schizophrenia), cancer (prostate), trauma (emergency room), and infectious disease (AIDS) through collaborations with domain experts in those respective fields. In summary, Phase II research will establish the essential technical foundation for Phase III commercialization, with the objective of providing a suite of new classification analysis methods as an advanced statistical tool that improves epidemiologic, clinical, and public health research.
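
As a rough illustration of what a decision criterion that trades off specificity against sensitivity can look like, the sketch below scans candidate thresholds on a set of posterior probabilities and keeps the one that maximizes a weighted sum of the two rates. It is not the project's estimator: the weighted-sum criterion, the function names, and the eight made-up observations are all assumptions chosen only for illustration.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Classify as positive when the posterior probability >= threshold."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def best_threshold(probs, labels, weight=0.5):
    """Hypothetical allocation rule: maximize
    weight * sensitivity + (1 - weight) * specificity
    over the observed probabilities; ties go to the lowest threshold."""
    best = None
    for t in sorted(set(probs)):
        sens, spec = sensitivity_specificity(probs, labels, t)
        score = weight * sens + (1 - weight) * spec
        if best is None or score > best[0]:
            best = (score, t, sens, spec)
    return best[1], best[2], best[3]


# Made-up posterior probabilities and true outcomes (1 = disorder present).
probs = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

for w in (0.2, 0.5, 0.8):
    t, sens, spec = best_threshold(probs, labels, weight=w)
    print(f"weight={w}: threshold={t}, sensitivity={sens}, specificity={spec}")
```

With equal weights this criterion is equivalent to maximizing Youden's J statistic; lowering the weight favors specificity and moves the chosen threshold upward on this toy data. Confidence intervals for the threshold, which the abstract also lists, would need a separate procedure and are not attempted here.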

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #
5R44CA139607-04
Application #
7917387
Study Section
Special Emphasis Panel (ZRG1-HOP-E (10))
Program Officer
Evans, Gregory
Project Start
2003-06-04
Project End
2012-12-31
Budget Start
2010-09-01
Budget End
2012-12-31
Support Year
4
Fiscal Year
2010
Total Cost
$990,520
Indirect Cost
Name
Martingale Research Corporation
Department
Type
DUNS #
174995134
City
Plano
State
TX
Country
United States
Zip Code
75074
Henley, Steven S; Kashner, T Michael; Golden, Richard M et al. (2016) Response to letter regarding "A systematic approach to subgroup analyses in a smoking cessation trial". Am J Drug Alcohol Abuse 42:112-3
Westover, Arthur N; Kashner, T Michael; Winhusen, Theresa M et al. (2015) A systematic approach to subgroup analyses in a smoking cessation trial. Am J Drug Alcohol Abuse 41:498-507
Brakenridge, Scott C; Henley, Steven S; Kashner, T Michael et al. (2013) Comparing clinical predictors of deep venous thrombosis versus pulmonary embolus after severe injury: a new paradigm for posttraumatic venous thromboembolism? J Trauma Acute Care Surg 74:1231-7; discussion 1237-8
Brakenridge, Scott C; Phelan, Herb A; Henley, Steven S et al. (2011) Early blood product and crystalloid volume resuscitation: risk association with multiple organ dysfunction after severe blunt traumatic injury. J Trauma 71:299-305
Trivedi, Madhukar H; Greer, Tracy L; Church, Timothy S et al. (2011) Exercise as an augmentation treatment for nonremitted major depressive disorder: a randomized, parallel dose comparison. J Clin Psychiatry 72:677-84
Kaminetzky, Catherine P; Keitz, Sheri A; Kashner, T Michael et al. (2011) Training satisfaction for subspecialty fellows in internal medicine: findings from the Veterans Affairs (VA) Learners' Perceptions Survey. BMC Med Educ 11:21
Byrne, John M; Kashner, Michael; Gilman, Stuart C et al. (2010) Measuring the intensity of resident supervision in the department of veterans affairs: the resident supervision index. Acad Med 85:1171-81
Kashner, T Michael; Byrne, John M; Chang, Barbara K et al. (2010) Measuring progressive independence with the resident supervision index: empirical approach. J Grad Med Educ 2:17-30
Kashner, T Michael; Henley, Steven S; Golden, Richard M et al. (2010) Studying the effects of ACGME duty hours limits on resident satisfaction: results from VA learners' perceptions survey. Acad Med 85:1130-9