Despite many advances in cancer diagnosis and therapy, cancer remains the second leading cause of death in the US. This is due in part to a lack of adequate methods for early detection [1-3]. Multiple studies cite the need for a noninvasive test that is both sensitive and specific for a particular cancer [5-8]. A diagnostic test based on multiple biomarkers may be necessary to achieve the desired sensitivity and specificity [12]. Statistical methodologies that can model complex biologic interactions and that are easily interpretable allow biomarker research to be translated into diagnostic tools. Logic regression, a relatively new multivariable regression method that predicts binary outcomes using logical (Boolean) combinations of binary predictors, can capture the complex interactions in biologic systems in easily interpretable models [9]. The three specific aims of this proposal develop and assess new statistical methods that extend current logic regression methodology to improve the identification and evaluation of combinations of biomarkers for cancer.
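As a minimal illustration of what "logical combinations of binary predictors" means, the sketch below evaluates a single hand-written logic tree on a few binary marker profiles. The expression (X1 AND NOT X2) OR X3, the function name `logic_tree`, and the sample data are hypothetical and chosen only for illustration; logic regression would search for such an expression from data rather than have it specified in advance.

```python
from typing import Sequence


def logic_tree(x: Sequence[int]) -> int:
    """Hypothetical logic tree: predict 1 when (X1 AND NOT X2) OR X3 holds."""
    x1, x2, x3 = (bool(v) for v in x)
    return int((x1 and not x2) or x3)


# Illustrative binary marker profiles (X1, X2, X3); not real data.
samples = [(1, 0, 0), (1, 1, 0), (0, 1, 1), (0, 0, 0)]
predictions = [logic_tree(s) for s in samples]
print(predictions)
```

The appeal of this form is interpretability: the fitted model is itself a readable Boolean rule over the markers.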
Aim 1 extends logic regression from a single logic tree model to an ensemble of logic trees for analysis of binary predictors and a binary outcome. The ensemble model will classify binary outcomes by popular vote in the ensemble. 1a: The predictive accuracy of the ensemble of logic trees model will be compared to the predictive accuracy of a single logic tree model and competing ensemble methods. 1b: The ability of the ensemble model to correctly identify important individual predictors will be evaluated. 1c: The ability of the ensemble to correctly identify combinations of predictors will be evaluated.
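The voting step in Aim 1 can be sketched as follows, assuming each tree in the ensemble returns a 0/1 prediction. The three trees, the helper `ensemble_predict`, and the tie-breaking rule are hypothetical illustrations; in practice the trees would be fitted to data (for example on resampled training sets), not written by hand.

```python
from typing import Callable, List, Sequence

# A logic tree is modeled here as any function from a binary profile to 0/1.
LogicTree = Callable[[Sequence[int]], int]


def ensemble_predict(trees: List[LogicTree], x: Sequence[int]) -> int:
    """Classify x by majority vote; ties (possible with an even number
    of trees) are resolved to class 0 in this sketch."""
    votes = sum(tree(x) for tree in trees)
    return int(2 * votes > len(trees))


# Hypothetical hand-written trees over markers (X1, X2, X3).
trees: List[LogicTree] = [
    lambda x: int(x[0] and not x[1]),  # X1 AND NOT X2
    lambda x: int(x[0] or x[2]),       # X1 OR X3
    lambda x: int(x[1] and x[2]),      # X2 AND X3
]

profiles = [(1, 0, 0), (0, 1, 1), (0, 1, 0), (0, 0, 0)]
ensemble_predictions = [ensemble_predict(trees, p) for p in profiles]
print(ensemble_predictions)
```

Using an odd number of trees avoids ties altogether, which is one simple design choice for a binary outcome.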
Aim 2 will extend logic regression methodology to classify ordinal outcomes, where current methods handle binary outcomes only.
Aim 3 will extend the method developed in Aim 2 from a model with a single ordinal logic tree to an ensemble of ordinal logic trees for the analysis of binary predictors and ordinal outcomes. 3a: Assess the ability of an ensemble of ordinal logic trees to correctly classify observations compared to a single ordinal logic tree model and competing ensemble methods. 3b: Evaluate the ability of an ensemble of ordinal logic trees to correctly identify important predictors. 3c: Evaluate the ability of an ensemble of ordinal logic trees to correctly identify important combinations of predictors. This research fits the NCI Division of Cancer Prevention goal of "Identification, development, and evaluation of biological analytic techniques, methodologies, and clinical technologies relevant to pre-clinical cancer detection and prevention of primary and recurrent cancers." This research is relevant to public health because its objective is to provide analytic methods that will aid in the development of tools for cancer risk assessment, screening, prognosis, and treatment, which have the potential to decrease the rate of cancer-related mortality.
Diagnostic tests based on multiple biomarkers are needed to improve cancer risk assessment, early detection, prognosis and treatment. This research develops and evaluates new statistical methods for identifying complex combinations of binary biomarkers predictive of disease state. These methods have promise for greater sensitivity to the intricate biological interactions associated with cancer initiation and progression, thereby improving diagnosis and outcomes.
Wolf, Bethany J; Slate, Elizabeth H; Hill, Elizabeth G (2015) Ordinal Logic Regression: A classifier for discovering combinations of binary markers for ordinal outcomes. Comput Stat Data Anal 82:152-163
Hill, E G; Slate, E H (2014) A semi-parametric Bayesian model of inter- and intra-examiner agreement for periodontal probing depth. Ann Appl Stat 8:331-351
Tsoi, Lam C; Qin, Tingting; Slate, Elizabeth H et al. (2011) Consistent Differential Expression Pattern (CDEP) on microarray to identify genes related to metastatic behavior. BMC Bioinformatics 12:438
Wolf, Bethany J; Hill, Elizabeth G; Slate, Elizabeth H (2010) Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics 26:2183-9