This award provides support for a workshop on Case Studies of Causal Discovery with Model Search, to be held at Carnegie Mellon University in October 2012. Computer scientists, statisticians, and philosophers have created a precise mathematical framework for representing causal systems called "Graphical Causal Models." This framework has supported the rigorous description of causal model spaces and the notion of empirical indistinguishability/equivalence within such spaces, which has in turn enabled computer scientists to develop asymptotically reliable model search algorithms for efficiently searching these spaces. However, the conditions under which these methods are practically useful in applied science are unknown. The investigators will hold a 3-day workshop on methodological issues germane to practical causal discovery that brings together scholars from genetics, biology, economics, fMRI-based cognitive neuroscience, climate research, public health, sociology, and education research, all of whom have successfully applied computerized search for causal models. The goals of the workshop are: (1) to identify strategies for applying causal model search to diverse domain-specific scientific questions; (2) to identify and discuss methodological challenges that arise when applying causal model search to real-world scientific problems; and (3) to take concrete steps toward creating an interdisciplinary community of researchers interested in applied causal model search.

The project will advance our scientific understanding of the conditions under which one can efficiently and reliably gain causal knowledge from both experimental and non-experimental data, especially in contexts in which background theory is weak. It will provide a set of real-world cases, problems, and domains that will spur practical advances in causal discovery, which affects people's everyday lives through public policy and the medical, social, and behavioral sciences. Specific products of the workshop include plans for a book-length volume of case studies, possibly a blueprint for an online, sustainable, extendable repository of data sets and supporting analyses, and a plan for ongoing workshops.

Project Report

In October 2013, we held a workshop at Carnegie Mellon University in which economists, biologists, psychologists, educational researchers, statisticians, computer scientists, and philosophers presented scientific applications of causal model discovery, a field that has exploded methodologically over the last few decades but which is just beginning to find its way into meaningful scientific applications. The workshop opened with a 3-hour tutorial on causal model discovery, which is available in its entirety at www.hss.cmu.edu/philosophy/casestudiesworkshop.php.

Three economists presented work on time series data addressing the causal relationships driving a) demand/price relationships for meat, b) macroeconomic relationships involving money supply and federally determined interest rates, and c) the causes and effects of the level of international trade within a given economy. Four brain researchers presented work on a) using fMRI data to discover the processing cascade between brain regions in a cognitive task, b) using fMRI data to discover the difference between the causal structure of the brain in neuro-typical and autistic subjects, c) using fMRI data to separate the lagged from contemporaneous effects in brain region relations, and d) a comparison of discovery results on fMRI and MEG data. Educational researchers presented an analysis which suggested that teaching students to understand fractions before they become fluent in working with them is educationally more effective, a prediction that was validated in a follow-up experiment. Biologists presented a case of causal discovery involving learning the causal determinants of leaf economics worldwide. Four genetics researchers described analyses involving a) detecting the genetic regulatory structure affecting flowering in the Arabidopsis genome, b) methods for extremely high dimensional genetic data, c) causal inference for signaling pathways in cancer vs. normal cells from single cell RNA data, and d) causal inference from mass-cytometry data on protein concentrations in single cells. Finally, climate scientists used causal discovery technology to discover causal graphs of information flow in long term climate change.

Bringing together researchers in all of these disciplines, from locations in Canada, Europe, and all over the U.S., was extremely useful in identifying several successful strategies for applied causal discovery and in identifying important theoretical challenges for the field moving forward. The attendees were very excited by the interdisciplinary insights of the workshop, and are committed to holding similar workshops in the future, either every year or every other year. Common strategies that emerged involved methods a) to separate ‘fast acting’ vs. ‘lagged’ effects, b) to separate a robust causal signal from algorithmic noise, and c) to handle high-dimensional data (thousands of variables). Challenges that emerged included a) finding guidelines for useful "sampling rates" for collecting time-series data, b) how to "aggregate" data, e.g., tissue data as opposed to single cell data in genetics or brain region activity in fMRI vs. single neuron recording, c) fusing data from different sources, e.g., fMRI, EEG, PET in neuroscience, or mRNA, proteomics, GWAS in genetics, etc., and d) the effect of parametric assumptions. Almost every talk in the workshop was videotaped and is available, along with the accompanying slides, at www.hss.cmu.edu/philosophy/casestudiesworkshop.php.

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 1156001
Program Officer: Cheryl Eavey
Project Start:
Project End:
Budget Start: 2012-05-01
Budget End: 2014-04-30
Support Year:
Fiscal Year: 2011
Total Cost: $45,000
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213