The scientific community, industry, and the general public have become increasingly concerned about a lack of replicability among published discoveries. Prominent institutions and journals have published policies on the problem, yet confusion and debate about solutions persist. This proposal presents a statistical approach to the challenge of research replicability. The approach will help circumvent the extensive and costly efforts, and the delays in the initial reporting of important findings, caused by conventional solutions, while facilitating changes in how scientists evaluate and communicate research. Existing solutions include standardization with proof of replication across laboratories, systematic variation and heterogenization of experiments within laboratories, aggregation of convergent evidence, and meta-analysis of highly heterogeneous data. These solutions are not always practical or even feasible, and they add significant cost, time, and complexity to the execution of experimental work. Furthermore, even under rigorous standardization, results can differ between laboratories because of normal, unavoidable variation among the laboratories in which studies are executed. We propose a solution built around community data sharing through the Mouse Phenome Database (MPD) to estimate and model the impact of laboratory variation on replicability. The project uses data-driven, informatics-based approaches that exploit public, large-scale, heterogeneous, and complex data. It will advance knowledge by developing practical, rigorous quantitative and statistical methods that allow investigators to evaluate the replicability of their results prior to publication. The project is intended to provide an approach, guidelines, and publicly available data resources that reduce the number of irreproducible studies that are published and improperly used as foundational research, ultimately restoring confidence in the public's investment in research through timely, cost-effective improvements in the scientific process. Validation studies will include analysis of data from multi-laboratory replications of behavioral genetics experiments, such as those being generated by addiction scientists in NIDA Centers of Excellence. The approach readily extends to research in the natural and physical sciences far beyond behavioral genetics. The publicly available datasets and methods will enable other biostatisticians, including trainees and early career scientists, to model the problem of laboratory variation and replicability within their own research endeavors.
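By way of illustration only, and not as the project's specified methodology, the kind of analysis envisioned above can be sketched as a mixed-effects model that treats laboratory as a random effect, estimates a strain-by-laboratory interaction variance, and then uses that variance to widen the standard error of a strain comparison so that a declared difference is one expected to hold up in other laboratories. In the minimal Python sketch below, the column names (lab, strain, phenotype), the simulated effect sizes, and the sample sizes are all hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated multi-laboratory phenotyping data (hypothetical column names:
# "lab", "strain", "phenotype"; effect sizes and sample sizes are arbitrary).
rng = np.random.default_rng(0)
rows = []
for lab in ["lab1", "lab2", "lab3", "lab4"]:
    lab_effect = rng.normal(0, 0.5)                  # laboratory main effect
    for strain, strain_effect in [("A", 0.0), ("B", 1.0)]:
        gxl = rng.normal(0, 0.4)                     # strain-by-lab interaction
        for _ in range(10):                          # animals per strain per lab
            rows.append({"lab": lab, "strain": strain,
                         "phenotype": 10 + strain_effect + lab_effect + gxl
                                      + rng.normal(0, 1.0)})
df = pd.DataFrame(rows)

# Mixed model: fixed strain effect, random laboratory intercept, and a
# variance component for the strain-by-laboratory interaction.
model = smf.mixedlm("phenotype ~ C(strain)", df, groups="lab",
                    vc_formula={"gxl": "0 + C(strain)"})
fit = model.fit(reml=True)

sigma2_e = fit.scale        # residual (within-laboratory) variance
sigma2_gxl = fit.vcomp[0]   # estimated strain-by-laboratory variance
n = 10                      # animals per strain per laboratory

# Compare the two strains twice: with the conventional standard error and
# with a standard error inflated by the interaction variance, so that a
# "significant" difference is one expected to replicate elsewhere.
diff = df.groupby("strain")["phenotype"].mean().diff().iloc[-1]
z_naive = diff / np.sqrt(2 * sigma2_e / n)
z_adjusted = diff / np.sqrt(2 * (sigma2_e / n + sigma2_gxl))
print(f"strain difference {diff:.2f}: naive z = {z_naive:.2f}, "
      f"lab-adjusted z = {z_adjusted:.2f}")

In this toy example the adjusted statistic is smaller than the naive one whenever the estimated interaction variance is nonzero, which is the intended conservatism: single-laboratory findings are discounted in proportion to how much laboratories are estimated to disagree.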
The scientific community and the general public have become increasingly concerned about a lack of replicability among published discoveries, particularly in behavioral science but extending to many areas of preclinical research. This proposal presents a practical approach to the challenge of research replicability that will help circumvent extensive and costly efforts and delays in the initial reporting of important findings, while facilitating changes in how scientists evaluate and communicate research. The proposed project will provide an approach, guidelines, and publicly available data resources that reduce the number of irreproducible studies that are published and improperly used as foundational research, increasing the public health impact of NIDA research and ultimately restoring confidence in the public's investment in research through timely, cost-effective improvements in the scientific process.