This proposal is submitted in response to NOT-OD-09-058, "NIH Announces the Availability of Recovery Act Funds for Competitive Revision Applications." Health status and outcomes are frequently measured on an ordinal scale. Examples include scoring methods for liver biopsy specimens from patients with chronic hepatitis, including the Knodell hepatic activity index, the Ishak score, and the METAVIR score. In addition, tumor-node-metastasis stage for cancer patients is an ordinal-scale measure. Moreover, the more recently advocated method for evaluating response to treatment in target tumor lesions is the Response Evaluation Criteria In Solid Tumors method, with ordinal outcomes defined as complete response, partial response, stable disease, and progressive disease. Traditional ordinal response modeling methods assume independence among the predictor variables and require that the number of samples (n) exceed the number of covariates (p). Both assumptions are violated in the context of high-throughput genomic studies. Our currently funded R03 grant, "Recursive partitioning and ensemble methods for classifying an ordinal response," consists of the following three specific aims: (SA.1) extend the recursive partitioning and random forest classification methodologies for predicting an ordinal response by developing computational tools for the R programming environment, including implementing our ordinal impurity criteria in rpart and in randomForest; (SA.2) evaluate the proposed ordinal classification methods in comparison to existing nominal and continuous response methods using simulated, benchmark, and gene expression datasets; and (SA.3) develop and evaluate methods for assessing variable importance when interest is in predicting an ordinal response.
Recently, penalized models have been successfully applied to high-throughput genomic datasets in fitting linear, logistic, and Cox proportional hazards models with excellent performance. However, extension of penalized models to the ordinal response setting has not been described. Herein we propose to extend the L1 penalized method to ordinal response models to enable modeling of common ordinal response data when high-dimensional genomic data comprise the predictor space. This study will expand the scope of our current research by providing a model-based ordinal classification methodology applicable to high-dimensional datasets, to accompany the heuristic-based classification tree and random forest ordinal methodologies considered in the parent grant.
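For illustration only, one way to write an L1 penalized ordinal model is sketched below. The abstract does not specify the model form, so the forward continuation ratio logit used here is an assumption, as are the symbols: K ordered levels, cutpoint intercepts \alpha_k, a shared coefficient vector \beta over the p covariates, log-likelihood \ell, and tuning parameter \lambda.

```latex
% Assumed model form: forward continuation ratio logit, k = 1, ..., K-1
\operatorname{logit} P(Y_i = k \mid Y_i \ge k, \mathbf{x}_i)
  = \alpha_k + \mathbf{x}_i^{\top} \boldsymbol{\beta}

% L1 penalized estimate: only the covariate effects are penalized,
% the cutpoint intercepts \alpha_k are left unpenalized
(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})
  = \arg\min_{\boldsymbol{\alpha}, \boldsymbol{\beta}}
    \left\{ -\ell(\boldsymbol{\alpha}, \boldsymbol{\beta})
            + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
  \qquad \lambda \ge 0
```

Because the penalty is not differentiable at zero, sufficiently large values of \lambda set some \beta_j exactly to zero, which is what allows a model to be fit when p exceeds n.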
The specific aims of this competitive revision application are to:
Aim 1) Extend the L1 penalized methodology to enable predicting an ordinal response by developing computational tools for the R programming environment;
Aim 2) Using simulated, benchmark, and gene expression datasets, evaluate L1 penalized ordinal response models by comparing error rates from our L1 fitting algorithm to those obtained when using a forward variable selection modeling strategy and our ordinal random forest approach;
and Aim 3) Evaluate methods for assessing important covariates from L1 penalized ordinal response models.
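As a rough sketch of what Aim 1 could involve, the Python below restructures an ordinal response into the binary rows of a forward continuation ratio model and fits an L1 penalized logistic regression by proximal gradient descent. This is not the project's R implementation: the continuation ratio form, the fitting algorithm, the toy data, and all function names (soft_threshold, expand_continuation_ratio, fit_l1_logistic) are assumptions made for illustration.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def expand_continuation_ratio(X, y, n_levels):
    """Restructure ordinal data for a forward continuation ratio model.

    Levels are coded 0, ..., n_levels - 1. For each subject and each
    cutpoint k <= min(level, n_levels - 2), emit one binary row with
    target 1 if the subject stopped at k and 0 if it continued past k.
    Each row holds cutpoint indicator columns followed by the covariates.
    """
    rows, targets = [], []
    for x, level in zip(X, y):
        for k in range(min(level, n_levels - 2) + 1):
            indicators = [1.0 if j == k else 0.0 for j in range(n_levels - 1)]
            rows.append(indicators + list(x))
            targets.append(1.0 if level == k else 0.0)
    return rows, targets


def fit_l1_logistic(rows, targets, n_unpenalized, lam, step=0.1, n_iter=500):
    """Proximal gradient (ISTA) fit of L1 penalized logistic regression.

    The first n_unpenalized coefficients (cutpoint intercepts) are not shrunk.
    """
    n, p = len(rows), len(rows[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for row, t in zip(rows, targets):
            eta = sum(b * v for b, v in zip(beta, row))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            prob = 1.0 / (1.0 + math.exp(-eta))
            for j in range(p):
                grad[j] += (prob - t) * row[j] / n
        for j in range(p):
            z = beta[j] - step * grad[j]
            beta[j] = z if j < n_unpenalized else soft_threshold(z, step * lam)
    return beta


# Toy data (made up): the first covariate tracks the ordinal level,
# the second is an alternating +/-1 column unrelated to it.
X = [[0.1, 1.0], [0.4, -1.0], [1.1, 1.0], [1.4, -1.0], [2.1, 1.0], [2.4, -1.0]]
y = [0, 0, 1, 1, 2, 2]

rows, targets = expand_continuation_ratio(X, y, n_levels=3)
beta_heavy = fit_l1_logistic(rows, targets, n_unpenalized=2, lam=1e6)
beta_light = fit_l1_logistic(rows, targets, n_unpenalized=2, lam=0.01)
```

With a very large penalty (beta_heavy), both covariate coefficients are shrunk exactly to zero and only the cutpoint intercepts are estimated. A small penalty (beta_light) leaves the covariates free to enter the model. In practice the penalty would be chosen by cross-validation or an information criterion rather than fixed by hand.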
This project will develop L1 penalized ordinal response models and implement them in the R programming environment. By conducting extensive comparisons of various ordinal response modeling methods using simulated, benchmark, and gene expression datasets, we will be able to make a recommendation regarding ordinal response modeling to the scientific community. This research is significant since the ordinal response modeling methods developed during the project period will be broadly applicable to a variety of health, social, and behavioral research fields, which commonly collect responses on an ordinal scale.
Archer, K J; Williams, A A A (2012) L1 penalized continuation ratio models for ordinal response prediction using high-dimensional datasets. Stat Med 31:1464-74
Asomaning, N; Archer, K J (2012) High-throughput DNA methylation datasets for evaluating false discovery rate methodologies. Comput Stat Data Anal 56:1748-1756
Archer, K J (2010) rpartOrdinal: An R Package for Deriving a Classification Tree for Predicting an Ordinal Response. J Stat Softw 34:7
Archer, K J; Reese, S E (2010) Detection call algorithms for high-throughput gene expression microarray data. Brief Bioinform 11:244-52
Archer, K J; Mas, V R (2009) Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set. Stat Med 28:3597-610