Health status and outcomes are frequently measured on an ordinal scale. Examples include scoring methods for liver biopsy specimens from patients with chronic hepatitis, including the Knodell hepatic activity index, the Ishak score, and the METAVIR score. In addition, tumor-node-metastasis stage for cancer patients is an ordinal-scaled measure. Moreover, the more recently advocated method for evaluating response to treatment in target tumor lesions is the Response Evaluation Criteria In Solid Tumors method, with ordinal outcomes defined as complete response, partial response, stable disease, and progressive disease. Traditional ordinal response modeling methods assume independence among the predictor variables and require that the number of samples (n) exceed the number of covariates (p). Both assumptions are violated in the context of high-throughput genomic studies. Recently, penalized models have been successfully applied to high-throughput genomic datasets in fitting linear, logistic, and Cox proportional hazards models with excellent performance. However, extension of penalized models to the ordinal response setting has not been fully described, nor has software been made generally available. Herein we propose to apply the L1 penalization method to ordinal response models to enable modeling of common ordinal response data when high-dimensional genomic data comprise the predictor space. This study will expand the scope of our current research by providing additional model-based ordinal classification methodologies applicable to high-dimensional datasets, to accompany the heuristic-based classification tree and random forest ordinal methodologies we have previously described.
The specific aims of this application are to: (1) Develop R functions for implementing the stereotype logit model as well as an L1 penalized stereotype logit model for modeling an ordinal response. (2) Empirically examine the performance of the L1 penalized stereotype logit model and competitor ordinal response models by performing a simulation study and applying the models to publicly available microarray datasets. (3) Develop an R package for fitting a random-effects ordinal regression model for clustered ordinal response data. (4) Extend the random-effects ordinal regression model to include an L1 penalty term to accommodate high-dimensional covariate spaces and empirically examine the performance of the L1 penalized random-effects ordinal regression model through application to microarray data. Studies involving protocol biopsies, in which histopathological assessment and microarray studies are performed at the same time point, are increasingly common, so the methodology and software developed in this application will provide unique informatic methods for analyzing such data. Moreover, the ordinal response extensions proposed in this application, though initially conceived with microarray applications in mind, will be broadly applicable to a variety of health, social, and behavioral research fields, which commonly collect human preference data and other responses on an ordinal scale.
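As a rough illustration of the first aim, the sketch below evaluates an L1 penalized stereotype logit objective in plain Python. It is not the R software proposed in this application. It assumes one common parameterization of the stereotype logit model, in which the probability of class k is proportional to exp(alpha_k + phi_k * beta'x), with the scale parameters phi ordered from 1 down to 0, and it assumes that only the covariate coefficients beta are penalized. The function names, the toy data, and the 0-based class coding are all hypothetical.

```python
import math


def stereotype_probs(x, alpha, phi, beta):
    """Class probabilities under an assumed stereotype logit parameterization.

    P(Y = k | x) is proportional to exp(alpha[k] + phi[k] * (beta . x)).
    """
    eta = sum(b * xi for b, xi in zip(beta, x))       # shared linear predictor
    scores = [a + p * eta for a, p in zip(alpha, phi)]
    top = max(scores)                                 # subtract max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def penalized_nll(X, y, alpha, phi, beta, lam):
    """Negative log-likelihood plus an L1 penalty on beta only (an assumption)."""
    nll = 0.0
    for xi, yi in zip(X, y):
        nll -= math.log(stereotype_probs(xi, alpha, phi, beta)[yi])
    return nll + lam * sum(abs(b) for b in beta)


# Hypothetical toy data: 4 samples, 2 covariates, 3 ordered classes coded 0, 1, 2.
X = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8], [2.0, 1.0]]
y = [0, 1, 2, 1]
alpha = [0.2, 0.1, 0.0]   # intercepts, last fixed at 0 for identifiability
phi = [1.0, 0.5, 0.0]     # ordered scale parameters, 1 = phi_1 >= ... >= phi_K = 0
beta = [1.0, -2.0]

print(penalized_nll(X, y, alpha, phi, beta, lam=0.5))
```

In a full implementation this objective would be minimized over alpha, phi, and beta subject to the ordering constraint on phi. The L1 term shrinks some entries of beta exactly to zero, which is what makes fitting feasible when p exceeds n.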

Public Health Relevance

Most histopathological variables are reported on an ordinal scale. Studies involving protocol biopsies, in which histopathological assessment and microarray studies are performed at the same time point, are increasingly common, and the software developed in this application will provide unique informatic tools for analyzing such data. Moreover, the informatic methods proposed in this application, though initially conceived with microarray applications in mind, will be broadly applicable to a variety of health, social, and behavioral research fields, which commonly collect human preference data and other responses on an ordinal scale.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
1R01LM011169-01
Application #
8216289
Study Section
Special Emphasis Panel (ZLM1-ZH-C (01))
Program Officer
Ye, Jane
Project Start
2012-09-01
Project End
2016-08-31
Budget Start
2012-09-01
Budget End
2013-08-31
Support Year
1
Fiscal Year
2012
Total Cost
$255,679
Indirect Cost
$83,898
Name
Virginia Commonwealth University
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
105300446
City
Richmond
State
VA
Country
United States
Zip Code
23298
Ferber, Kyle; Archer, Kellie J (2015) Modeling discrete survival time using genomic feature data. Cancer Inform 14:37-43
Hou, Jiayi; Archer, Kellie J (2015) Regularization method for predicting an ordinal response using longitudinal high-dimensional genomic data. Stat Appl Genet Mol Biol 14:93-111
Johnson, Ryan M; Vu, Ngoc T; Griffin, Brian P et al. (2015) The Alternative Splicing of Cytoplasmic Polyadenylation Element Binding Protein 2 Drives Anoikis Resistance and the Metastasis of Triple Negative Breast Cancer. J Biol Chem 290:25717-27
Makowski, Mateusz; Archer, Kellie J (2015) Generalized monotone incremental forward stagewise method for modeling count data: application predicting micronuclei frequency. Cancer Inform 14:97-105
Gentry, Amanda Elswick; Jackson-Cook, Colleen K; Lyon, Debra E et al. (2015) Penalized Ordinal Regression Methods for Predicting Stage of Cancer in High-Dimensional Covariate Spaces. Cancer Inform 14:201-8
Zhou, Qing; Jackson-Cook, Colleen; Lyon, Debra et al. (2015) Identifying molecular features associated with psychoneurological symptoms in women with breast cancer using multivariate mixed models. Cancer Inform 14:139-45
Archer, Kellie J; Hou, Jiayi; Zhou, Qing et al. (2014) ordinalgmifs: An R Package for Ordinal Regression in High-dimensional Data Settings. Cancer Inform 13:187-95
Archer, K J; Williams, A A A (2012) L1 penalized continuation ratio models for ordinal response prediction using high-dimensional datasets. Stat Med 31:1464-74
Asomaning, N; Archer, K J (2012) High-throughput DNA methylation datasets for evaluating false discovery rate methodologies. Comput Stat Data Anal 56:1748-1756
Archer, Kellie J; Reese, Sarah E (2010) Detection call algorithms for high-throughput gene expression microarray data. Brief Bioinform 11:244-52

Showing the most recent 10 out of 12 publications