Much of science, including biomedical science, consists of discovering and modeling causal relationships. Increasingly, biomedical scientists have available multiple complex data types and a very large number of samples, each of which has an enormous number of measurements recorded. There is a pressing need for algorithms that can efficiently discover causal relationships from large and diverse types of biomedical data and background knowledge. In the past 25 years, tremendous progress has been made in developing general computational methods for representing and discovering causal knowledge from data. However, these methods are neither readily available to biomedical scientists nor easy for them to use, and they have not been designed to exploit the Big Data increasingly available for analysis. The proposed Center will create a computer ecosystem through which to implement and apply an integrated set of tools, new and repurposed, that support the representation and discovery of causal knowledge from large and complex biomedical data. These computational approaches will be accessible to a wide variety of biomedical researchers, data analysts, and data scientists who might not otherwise take advantage of them. Three very different biomedical problems will drive the development of the methods, tools, and interactive system architecture. While we anticipate that new biomedical discoveries will be made in each of these problem areas using the methods developed by the Center, the longer-term impact will result from the development of the computational technology itself, which will be generalizable to the full spectrum of biomedical research. The Center will actively share this knowledge and these methods and tools through a rich offering of training activities and through engagement with other Centers in the consortium.

Public Health Relevance

There is a pressing need for new computational methods that can assist biomedical scientists in discovering causal knowledge from large and complex biomedical datasets. The proposed Center will develop and make freely available a suite of such methods for use by biomedical scientists, data analysts, and data scientists. The Center will also provide training about the methods and engage actively with other Centers.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 5U54HG008540-04
Application #: 9281030
Study Section: Special Emphasis Panel (ZRG1-BST-R)
Project Start:
Project End:
Budget Start: 2017-05-01
Budget End: 2018-04-30
Support Year: 4
Fiscal Year: 2017
Total Cost: $2,219,517
Indirect Cost: $584,150
Name: University of Pittsburgh
Department:
Type: Domestic Higher Education
DUNS #: 004514360
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213
Raghu, Vineet K; Ramsey, Joseph D; Morris, Alison et al. (2018) Comparison of strategies for scalable causal discovery of latent variable models from mixed data. Int J Data Sci Anal 6:33-45
Huang, Biwei; Zhang, Kun; Lin, Yizhu et al. (2018) Generalized Score Functions for Causal Discovery. KDD 2018:1551-1560
Zhang, Kun; Schölkopf, Bernhard; Spirtes, Peter et al. (2018) Learning causality and causality-related learning: some recent progress. Natl Sci Rev 5:26-29
Meyer, Wynn K; Jamison, Jerrica; Richter, Rebecca et al. (2018) Ancient convergent losses of Paraoxonase 1 yield potential risks for modern marine mammals. Science 361:591-594
Naeini, Mahdi Pakdaman; Cooper, Gregory F (2018) Binary Classifier Calibration Using an Ensemble of Piecewise Linear Regression Models. Knowl Inf Syst 54:151-170
Lu, Songjian; Fan, Xiaonan; Chen, Lujia et al. (2018) A novel method of using Deep Belief Networks and genetic perturbation data to search for yeast signaling pathways. PLoS One 13:e0203871
Ponzoni, Luca; Bahar, Ivet (2018) Structural dynamics is a determinant of the functional significance of missense variants. Proc Natl Acad Sci U S A 115:4164-4169
Ding, Michael Q; Chen, Lujia; Cooper, Gregory F et al. (2018) Precision Oncology beyond Targeted Therapy: Combining Omics Data with Machine Learning Matches the Majority of Cancer Cells to Effective Therapeutics. Mol Cancer Res 16:269-278
Sedgewick, Andrew J; Buschur, Kristina; Shi, Ivy et al. (2018) Mixed Graphical Models for Integrative Causal Analysis with Application to Chronic Lung Disease Diagnosis and Prognosis. Bioinformatics
Andrews, Bryan; Ramsey, Joseph; Cooper, Gregory F (2018) Scoring Bayesian Networks of Mixed Variables. Int J Data Sci Anal 6:3-18

Showing the most recent 10 out of 61 publications