Understanding causes and effects is crucial in empirical science; however, observed associations do not always have clear causal explanations. For instance, hospital patients who were prescribed antibiotics also tend to suffer from opportunistic infections, though we don’t expect antibiotics to cause infections. The likely explanation in this case is that the early presence of an infectious agent caused both the later infection and the doctor's decision to prescribe antibiotics. A key obstacle to finding valid cause-effect relationships from data is the presence of hidden but causally relevant variables, like the infectious agent above. This project aims to develop new methods for drawing valid causal inferences in datasets with hidden variables while avoiding known pitfalls of existing approaches. Aside from its role in data analysis, causal literacy is an important skill for making informed choices as citizens and consumers. The investigator aims to promote this skill by incorporating causal methods into existing data science courses at Johns Hopkins University, developing a new course that will teach methods for detecting misunderstandings of causal claims, and developing a tutorial aimed at bridging the gap between machine learning and statistics in discussing and working on causal inference.
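The antibiotics example above can be illustrated with a small simulation. This is a minimal sketch and not part of the proposed methods: the variable names, the probabilities, and the helper functions `simulate` and `risk` are all assumptions chosen for illustration. A hidden infectious agent `u` raises the chance of both an antibiotic prescription `a` and a later infection `y`, while `a` has no effect on `y`.

```python
import random


def simulate(n=100_000, seed=0):
    """Draw (u, a, y) triples; all probabilities are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random() < 0.3                   # hidden infectious agent
        a = rng.random() < (0.8 if u else 0.1)   # antibiotics depend only on u
        y = rng.random() < (0.6 if u else 0.05)  # later infection depends only on u
        rows.append((u, a, y))
    return rows


def risk(rows, a, u=None):
    """Fraction with a later infection among rows with treatment a (and agent u, if given)."""
    selected = [y for (ui, ai, y) in rows if ai == a and (u is None or ui == u)]
    return sum(selected) / len(selected)


rows = simulate()

# Ignoring u: antibiotics appear strongly associated with later infection.
naive = risk(rows, True) - risk(rows, False)

# Holding u fixed: the association largely disappears in each stratum.
adjusted = {u: risk(rows, True, u) - risk(rows, False, u) for u in (True, False)}

print(f"naive risk difference: {naive:.3f}")
for u, diff in adjusted.items():
    print(f"risk difference with u={u}: {diff:.3f}")
```

In real data the infectious agent would not be recorded, so the stratified comparison is unavailable; that gap is the kind of obstacle the abstract describes.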

Directed acyclic graphs (DAGs) are an elegant method for reasoning about fully observed causal systems. The proposed research aims to provide a new formalism for causal systems with hidden variables that retains the advantages of DAGs, while dispensing with the disadvantages of representing hidden variables directly. This formalism captures all equality constraints in the observed marginal distribution via a regular model of a mixed graph. Success in the proposed research will significantly advance understanding of all major tasks in causal systems with hidden variables: identification, estimation, and computationally efficient probabilistic calculations. As a test bed for methodological developments, the investigator will use a dataset of electronic health records obtained in partnership with the Malone Center for Engineering in Healthcare and the Johns Hopkins Department of Surgery.
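To make the idea of replacing hidden variables with a mixed graph concrete, the sketch below computes a latent projection, a standard construction that maps a DAG with hidden variables to a graph over the observed variables with directed and bidirected edges. The abstract does not state that the project uses this construction, so it is shown only as an assumed illustration; the function names and the example graphs are hypothetical.

```python
from itertools import combinations


def observed_reach(dag, start, hidden):
    """Observed nodes reachable from `start` along directed paths
    whose intermediate nodes are all hidden."""
    found, seen = set(), set()
    stack = list(dag.get(start, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in hidden:
            stack.extend(dag.get(node, ()))  # keep walking through hidden nodes
        else:
            found.add(node)                  # stop at the first observed node
    return found


def latent_projection(dag, hidden):
    """Return (directed, bidirected) edge sets over the observed nodes.

    `dag` maps each node to a list of its children; `hidden` is a set of nodes.
    """
    nodes = set(dag) | {child for children in dag.values() for child in children}
    observed = nodes - hidden
    directed = {(a, b) for a in observed for b in observed_reach(dag, a, hidden)}
    bidirected = set()
    for h in hidden:
        # Two observed nodes sharing a hidden common cause get a bidirected edge.
        reach = observed_reach(dag, h, hidden)
        bidirected |= {frozenset(pair) for pair in combinations(sorted(reach), 2)}
    return directed, bidirected


# Hidden agent U causes both antibiotics A and later infection Y.
example = latent_projection({"U": ["A", "Y"]}, hidden={"U"})
print(example)
```

For the antibiotics example, the projection contains no directed edge between A and Y, only a bidirected edge recording that they share a hidden common cause.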

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1942239
Program Officer: Rebecca Hwa
Budget Start: 2020-07-01
Budget End: 2025-06-30
Fiscal Year: 2019
Total Cost: $112,695
Name: Johns Hopkins University
City: Baltimore
State: MD
Country: United States
Zip Code: 21218