The increased availability of data and accessibility of computational tools in recent years have produced an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics constrain the reliability of conclusions drawn from data, and misuse of statistical methods is a growing concern. We have been developing tools for assessing the predictability of common measures of statistical significance of research findings. These methods operate on test statistics or P-values as summaries of data and also incorporate external or prior information to make inferences about uncertainty in statistics or parameters of interest, such as P-values or risk of disease. In a currently submitted manuscript, we develop approximate Bayesian methods that use the information contained in P-values but overcome their flaws and limitations. A preprint of this research is available at https://doi.org/10.1101/714287. In a manuscript that is tentatively accepted to Frontiers in Genetics -- Statistical Genetics and Methodology, we develop new statistical methods to combine top-ranking statistical associations. These methods can be used in observational studies to detect an aggregated effect of multiple weak predictors on complex disorders. They are also being applied in a collaborative project with Dr. Gordenin's group to explore patterns of somatic mutations in cancer genomes. Practical applications and methodological extensions of methods based on top-ranking statistics are hindered by their computational complexity. In the course of this work, we derived the exact distribution of the rank truncated product (RTP), which substantially simplifies its evaluation. We also proposed an efficient adaptive method that does not require time-consuming computer simulations, and developed extensions for combining correlated effects that yield a substantial gain in power compared to previously published methods.
Further, we proposed a highly promising combination statistic that captures the main features of RTP but has higher power and can be implemented in a few lines of elementary R code. Preprints of this research are available at https://doi.org/10.1101/665133 and https://doi.org/10.1101/667238 (both have been submitted for publication).
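
To make the RTP idea concrete: the statistic is the product of the k smallest of m P-values. The following is a minimal Python sketch, not the exact distribution or the adaptive method described above. It assumes independent P-values and relies on the brute-force Monte Carlo evaluation that those results are meant to replace. The function names, the choice of k, and the number of simulations are illustrative assumptions.

```python
import math
import random


def rtp_statistic(p_values, k):
    """Log of the rank truncated product: sum of logs of the k smallest P-values."""
    smallest = sorted(p_values)[:k]
    return sum(math.log(p) for p in smallest)


def rtp_p_value_mc(p_values, k, n_sim=20000, seed=0):
    """Monte Carlo P-value for the RTP under the global null.

    Assumes the m P-values are independent and uniform on (0, 1] under the
    null hypothesis. This is the simulation-based evaluation, shown only for
    illustration; it is not the exact distribution.
    """
    rng = random.Random(seed)
    observed = rtp_statistic(p_values, k)
    m = len(p_values)
    hits = 0
    for _ in range(n_sim):
        # 1 - random() lies in (0, 1], which avoids taking log(0).
        null_p = [1.0 - rng.random() for _ in range(m)]
        # Smaller products are more extreme, so count null draws at or below the observed value.
        if rtp_statistic(null_p, k) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Hypothetical P-values: two small values among mostly unremarkable ones.
    example = [0.001, 0.002, 0.5, 0.8, 0.9]
    print(rtp_p_value_mc(example, k=2))
```

The cost of this approach grows with the number of simulations needed to resolve small combined P-values, which is the computational burden that an exact distribution removes.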