Combining information from multiple studies continues to be a cost-effective approach in biomedical research. In the traditional statistical literature, the associated analytic method is known as meta-analysis. However, the statistical tools for meta-analysis were developed under rather restricted settings. Developing the next generation of meta-analysis methods raises many new challenges, ranging from increasing the robustness of traditional meta-analysis to enhancing the protection of data privacy when sharing patient-level information. In this proposal, we aim to address several important analytic issues that arise from combining multiple studies. We expect that the planned methodological development will provide a general framework for effective information pooling from various sources. We also aim to facilitate the development of new regulatory pathways for integrating real-world evidence into the drug development process. The proposal contains three specific aims.
In Specific Aim 1, we plan to develop valid and general inferential procedures for random-effects meta-analysis that allow the number of studies to be small or the study-specific treatment effect estimator to be irregular, settings in which statistical inference based on traditional random-effects models fails.
In Specific Aim 2, we plan to develop robust and efficient procedures for estimating treatment effects by synthesizing information from real-world evidence data and randomized clinical trials. Their broad patient populations and detailed patient information make large databases such as electronic medical records a valuable source for precision medicine research. Effectively extracting the rich information in real-world evidence data has thus become a pressing need. In this aim, we propose to develop an adaptive causal inference procedure based on multiple studies to correct biases from various sources under relaxed assumptions.
In Specific Aim 3, we propose to develop optimal estimation and prediction procedures based on data from multiple sources in the presence of data privacy concerns and between-study heterogeneity. The first part of the aim concerns a divide-and-conquer strategy that bypasses the need for patient-level data, alleviating privacy concerns in data sharing. The second part of the aim concerns a set of statistical learning methods for predicting patients' future outcomes and selecting the optimal treatment while accounting for between-study heterogeneity, when patient-level data can be shared.
(Public Health Relevance Statement) Pooling information from multiple sources, including real-world evidence data, is a highly cost-effective approach in biomedical research. The traditional statistical tool, i.e., meta-analysis, was developed under rather restricted settings, and we propose to develop novel methodology for the next generation of meta-analysis in the era of big data. We plan to apply the new methods to precision medicine, including individualized diagnosis and treatment recommendation.