Cancer biomarker research is flourishing across academia, industry and government, buoyed by technological advances, the accumulation of research and data integration. Despite the large number of biomarker discovery studies, surprisingly few studies have examined sample size requirements for discovery studies, and very few publicly available tools allow researchers to conduct power analyses to determine those requirements. The lack of power analysis tools suggests that most biomarker studies are undertaken without proper power analyses, increasing the risk of inadequate sample sizes. Further, no existing tools account for disease heterogeneity, which has been shown to dramatically increase sample size requirements and to significantly alter the relative power of different analytic strategies for biomarker selection. For example, with 100 cases and 100 controls, the ordinary t-test detects 99% of biomarkers for a homogeneous disease, but only 18% for a heterogeneous disease. Thus, a biomarker study that uses the t-test and is powered for a homogeneous disease would have only about one-fifth of the anticipated power if the disease is heterogeneous. Less commonly used methods, such as the partial AUC, perform well for heterogeneous diseases but poorly for homogeneous diseases. The implications are significant: studies to identify biomarkers for the early detection of heterogeneous diseases require different statistical selection methods and larger sample sizes than studies of homogeneous diseases. Previous biomarker discovery studies may therefore have failed to identify biomarkers because they were underpowered for a heterogeneous disease or because they used selection methods that were inappropriate for heterogeneous diseases. This project will develop publicly available tools to help both statisticians and non-statisticians plan biomarker studies. The project has the following specific aims:
Aim 1: Develop power analysis tools for planning biomarker discovery studies. These tools will enable researchers to determine sample size requirements and examine multiple analytic methods for discovery research while properly accounting for disease heterogeneity.
Aim 2: Extend the tools to support power analyses for construction of biomarker signatures.
The proposed research will develop critical tools for the planning and analysis of biomarker studies, which will help researchers discover new cancer biomarkers. These biomarkers may enable the earlier detection of cancer, guide therapeutic choices, provide a means to monitor response to therapy, and lead to the development of novel therapeutic options for cancer.
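
As a rough illustration of why disease heterogeneity erodes the power of the t-test, the sketch below estimates power by simulation for a single candidate biomarker. It is not the project's tool and is not expected to reproduce the 99% and 18% figures quoted above. Everything in it is an assumption made for illustration: the function names, the normal distributions, the 0.7 standard-deviation shift, the model of heterogeneity as only 20% of cases showing an elevated biomarker, and the two-sided 5% significance threshold applied to one biomarker at a time.

```python
import math
import random
import statistics


def t_statistic(cases, controls):
    """Pooled-variance two-sample t statistic."""
    n1, n2 = len(cases), len(controls)
    m1, m2 = statistics.fmean(cases), statistics.fmean(controls)
    v1, v2 = statistics.variance(cases), statistics.variance(controls)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))


def simulated_power(n_cases, n_controls, shift, fraction_elevated,
                    n_sims=1000, critical=1.972, seed=0):
    """Fraction of simulated studies in which |t| exceeds the critical value.

    Hypothetical model: controls are N(0, 1); each case is N(shift, 1) with
    probability `fraction_elevated` and N(0, 1) otherwise.
    `critical` defaults to the approximate two-sided 5% critical value of the
    t distribution with 198 degrees of freedom (100 cases + 100 controls);
    it must be changed for other sample sizes or significance levels.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_controls)]
        cases = [
            rng.gauss(shift if rng.random() < fraction_elevated else 0.0, 1.0)
            for _ in range(n_cases)
        ]
        if abs(t_statistic(cases, controls)) > critical:
            hits += 1
    return hits / n_sims


# Homogeneous disease: every case carries the elevated biomarker.
homogeneous = simulated_power(100, 100, shift=0.7, fraction_elevated=1.0)
# Heterogeneous disease (assumed): only 20% of cases carry it.
heterogeneous = simulated_power(100, 100, shift=0.7, fraction_elevated=0.2)

print(f"estimated power, homogeneous:   {homogeneous:.2f}")
print(f"estimated power, heterogeneous: {heterogeneous:.2f}")
```

Under these assumptions the average difference between cases and controls shrinks in proportion to the fraction of cases that carry the signal, which is why a mean-based test loses power so quickly. A planning tool of the kind proposed in Aim 1 would also need to handle multiple testing across many candidate biomarkers and alternative selection methods such as the partial AUC, which this sketch does not attempt.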