Health status and outcomes are frequently measured on an ordinal scale. For example, in acute lymphoblastic leukemia, minimal residual disease is an initial measure of treatment response that has been strongly predictive of event-free survival and risk of relapse, and patients are commonly stratified into one of three ordinal groups: standard, intermediate, or high risk. In acute myeloid leukemia, the European LeukemiaNet (ELN) classification system uses cytogenetic findings and selected mutations at diagnosis to assign patients to one of three risk groups: favorable, intermediate, or adverse. Molecular features monotonically associated with these ordinal responses may be prognostically relevant or potential therapeutic targets, so linking these ordinal responses to data from high-throughput genomic assays is of clinical interest. We previously developed frequentist-based penalized ordinal response models and software to enable modeling an ordinal response when high-dimensional genomic data comprise the predictor space. Although frequentist-based penalized models provide a sparse solution and so perform automatic variable selection, they require some method for selecting the penalty parameter (e.g., AIC, BIC, or cross-validation) to identify a final model. Moreover, once a specific penalty value is selected, all parameter estimates are conditional on that value. The frequentist-based approach also yields little information about the coefficients beyond whether they are non-zero; that is, no confidence intervals or p-values accompany the coefficient estimates. This project will therefore address a critical barrier to progress in this field by developing penalized Bayesian ordinal response models applicable to high-dimensional datasets.
Advantages of the Bayesian approach are that there is no need to select a value for the penalty parameter and that it yields credible intervals, which provide useful interpretations about the significance of each predictor. The specific aims of this application are to: (1) develop penalized Bayesian cumulative link, adjacent category, and stereotype logit models for high-dimensional datasets; and (2) develop penalized Bayesian forward continuation ratio (FCR) models with a complementary log-log link that allow for censoring for high-dimensional datasets. For both aims we will characterize the performance of the methods using extensive simulation studies and application to publicly available cancer datasets, develop software, and distribute R packages via CRAN. This research will fill a critical gap, as there are currently no Bayesian LASSO ordinal response models for high-dimensional data. Through our proposed variable inclusion indicator methodology, the Bayesian approach and software developed in this application will provide unique research methods for integrating clinical, demographic, high-throughput genomic, and ordinal response data. Moreover, the ordinal response extensions proposed in this application, though initially conceived of by considering applications to cancer, will be broadly applicable to a variety of health, social, and behavioral research fields, which commonly collect human preference data and other responses on an ordinal scale.
In cancer, many histopathological and clinical response variables are reported on an ordinal scale. Studies involving protocol biopsies, in which both histopathological assessment and high-throughput genomic assays are performed at the same time point, are increasingly common, and the software developed in this application will provide unique research methodologies for analyzing such data. Moreover, the methods proposed in this application, though initially conceived of by considering applications to cancer, will be broadly applicable to a variety of health, social, and behavioral research fields, which commonly collect human preference data and other responses on an ordinal scale.