Next-generation sequencing (NGS) has brought genome-wide coverage, dense resolution and high-throughput capability to many types of omics analyses, including gene expression, methylation, fusion gene detection, somatic mutation detection and many others. As costs drop, the technology is gaining popularity. Experimental expenses nevertheless remain significant, and power calculation tools are essential for adequately designing and guiding an NGS study. Unlike power calculation for traditional experiments or microarrays, power calculation for NGS requires simultaneous consideration of sample size and sequencing depth, and the count data bring additional statistical challenges. We propose the following aims: (1a) Develop power calculation tools for differential expression analysis in RNA-seq experiments, in which optimal sample size and sequencing depth are jointly determined by the power function and budget constraints. (1b) Develop power calculation tools for differential methylation analysis in methyl-seq experiments. (2a) Develop power calculation tools for fusion gene detection in cancer using RNA-seq, and identify the sample size and sequencing depth needed to detect fusion genes with low prevalence and low allelic fraction. (2b) Perform additional ultra-deep sequencing in the preliminary prostate study to identify additional fusion genes that have low allelic fraction and are predictive of prognosis. Successful completion of these aims will provide state-of-the-art power calculation tools for the fast-growing number of projects that use NGS technology for candidate marker and fusion gene detection.
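To illustrate how sample size and sequencing depth trade off under a fixed budget, the following is a minimal sketch for the fusion gene setting. It is not the proposed method. It assumes a deliberately simple model: each sample carries the fusion with a given prevalence, the number of fusion-supporting reads in a carrier is Poisson with mean proportional to depth, and a fusion is called when at least `min_reads` supporting reads are seen. All function names, the cost structure and the parameter values are hypothetical.

```python
import math


def poisson_sf(k_min, lam):
    """Return P(X >= k_min) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_min))
    return 1.0 - cdf


def detection_power(n_samples, depth_millions, prevalence,
                    reads_per_million, min_reads=3):
    """Probability of calling the fusion in at least one of n_samples.

    Assumed model (hypothetical): a carrier yields Poisson-distributed
    supporting reads with mean depth_millions * reads_per_million, where
    reads_per_million folds in allelic fraction and expression level.
    """
    lam = depth_millions * reads_per_million
    p_call_given_carrier = poisson_sf(min_reads, lam)
    p_call_per_sample = prevalence * p_call_given_carrier
    return 1.0 - (1.0 - p_call_per_sample) ** n_samples


def best_design(budget, cost_per_sample, cost_per_million_reads,
                prevalence, reads_per_million,
                depths=(10, 20, 50, 100, 200), min_reads=3):
    """Grid-search candidate depths; spend the whole budget at each depth.

    Returns (n_samples, depth_millions, power) for the best design,
    or None if no candidate depth is affordable for even one sample.
    """
    best = None
    for depth in depths:
        unit_cost = cost_per_sample + cost_per_million_reads * depth
        n = int(budget // unit_cost)
        if n < 1:
            continue
        power = detection_power(n, depth, prevalence,
                                reads_per_million, min_reads)
        if best is None or power > best[2]:
            best = (n, depth, power)
    return best


if __name__ == "__main__":
    design = best_design(budget=10000, cost_per_sample=100,
                         cost_per_million_reads=5,
                         prevalence=0.05, reads_per_million=0.1)
    print(design)
```

In this toy model, deeper sequencing raises the chance of calling a fusion in a carrier but reduces the number of samples the budget allows, so the best design depends on both prevalence and allelic fraction. A realistic tool would replace the Poisson assumption with a model fitted to pilot data.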
Next-generation sequencing studies are widely conducted for biomarker detection, including the detection of differentially expressed genes, differentially methylated genes and fusion genes. These studies are costly and raise considerable experimental design issues. We will develop power calculation tools that provide practical guidance for planning them.