Data Science Research

Cooper, Gregory; Bahar, Ivet; Berg, Jeremy

Abstract

Much of science, including biomedical science, consists of discovering and modeling causal relationships. Increasingly, biomedical scientists have available multiple complex data types and a very large number of samples, each of which has an enormous number of measurements recorded. There is a pressing need for algorithms that can efficiently discover causal relationships from large and diverse types of biomedical data and background knowledge. In the past 25 years, tremendous progress has been made in developing general computational methods for representing and discovering causal knowledge from data. However, these methods are neither readily available nor easy for biomedical scientists to use, and they have not been designed to exploit the Big Data increasingly available for analysis. The proposed Center will create a computer ecosystem through which to implement and apply an integrated set of tools, new and repurposed, that support the representation and discovery of causal knowledge from large and complex biomedical data. These computational approaches will be accessible to a wide variety of biomedical researchers, data analysts, and data scientists who might not otherwise take advantage of them. Three very different biomedical problems will drive the development of the methods, tools, and interactive system architecture. While we anticipate that new biomedical discoveries will be made in each of these problem areas using the methods developed by the Center, the longer-term impact will result from the development of the computational technology itself, which will be generalizable to the full spectrum of biomedical research. The Center will be very active in sharing this knowledge and these methods and tools through a rich offering of training activities and through engagement with other Centers in the consortium.

Public Health Relevance

There is a pressing need for new computational methods that can assist biomedical scientists in discovering causal knowledge from large and complex biomedical datasets. The proposed Center will develop and make freely available a suite of such methods for use by biomedical scientists, data analysts, and data scientists. The Center will also provide training about the methods and engage actively with other Centers.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Specialized Center--Cooperative Agreements (U54)
Project #: 1U54HG008540-01
Application #: 8932078
Study Section: Special Emphasis Panel (ZRG1-BST-R (52))
Program Officer: Brooks, Lisa

Project Start: 2014-09-29
Project End: 2018-08-31
Budget Start: 2014-07-01
Budget End: 2015-06-30
Support Year: 1
Fiscal Year: 2014
Total Cost: $1,585,052
Indirect Cost: $389,973

Institution

Name: University of Pittsburgh
DUNS #: 004514360

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Publications

Andrews, Bryan; Ramsey, Joseph; Cooper, Gregory F (2018) Scoring Bayesian Networks of Mixed Variables. Int J Data Sci Anal 6:3-18

Raghu, Vineet K; Ramsey, Joseph D; Morris, Alison et al. (2018) Comparison of strategies for scalable causal discovery of latent variable models from mixed data. Int J Data Sci Anal 6:33-45

Huang, Biwei; Zhang, Kun; Lin, Yizhu et al. (2018) Generalized Score Functions for Causal Discovery. KDD 2018:1551-1560

Zhang, Kun; Schölkopf, Bernhard; Spirtes, Peter et al. (2018) Learning causality and causality-related learning: some recent progress. Natl Sci Rev 5:26-29

Meyer, Wynn K; Jamison, Jerrica; Richter, Rebecca et al. (2018) Ancient convergent losses of Paraoxonase 1 yield potential risks for modern marine mammals. Science 361:591-594

Naeini, Mahdi Pakdaman; Cooper, Gregory F (2018) Binary Classifier Calibration Using an Ensemble of Piecewise Linear Regression Models. Knowl Inf Syst 54:151-170

Lu, Songjian; Fan, Xiaonan; Chen, Lujia et al. (2018) A novel method of using Deep Belief Networks and genetic perturbation data to search for yeast signaling pathways. PLoS One 13:e0203871

Ponzoni, Luca; Bahar, Ivet (2018) Structural dynamics is a determinant of the functional significance of missense variants. Proc Natl Acad Sci U S A 115:4164-4169

Ding, Michael Q; Chen, Lujia; Cooper, Gregory F et al. (2018) Precision Oncology beyond Targeted Therapy: Combining Omics Data with Machine Learning Matches the Majority of Cancer Cells to Effective Therapeutics. Mol Cancer Res 16:269-278

Sedgewick, Andrew J; Buschur, Kristina; Shi, Ivy et al. (2018) Mixed Graphical Models for Integrative Causal Analysis with Application to Chronic Lung Disease Diagnosis and Prognosis. Bioinformatics :

Showing the most recent 10 out of 61 publications
