Clinical data collection is accelerating rapidly, and in the future it will include both provider- and patient-generated data. Hidden within this mass of noisy observational data are clues to the factors influencing disease onset and outcome. Finding ways to exploit this trove of disease data can unlock a new perspective on disease processes. We can tackle disease both from the bottom up, using experimental data generated in the laboratory, and from the top down, using clinical phenomena observed across human populations. A particularly impactful and prevalent disease is cancer. Each tumor harbors a unique combination of mutations driving a distinct set of oncogenic processes. Targeted therapies have been proposed to pinpoint these mutations, potentially requiring a vast array of therapeutic options. Cancer treatment often fails when drug resistance arises, another result of the complex combinatorial nature of tumor alterations. Combination therapies have been proposed as an approach to interfere with multiple disease signals simultaneously. However, identifying effective drug combinations, and the cancer types in which they are effective, is experimentally infeasible, leading to a push for computational solutions. In this proposal, we combine methods from social sciences and biostatistics to find the causal effect of a drug on cancer onset from observational clinical data. Increased and decreased cancer rates in drug-takers are of equal interest, as both can inform us of disease processes and provide clinical impact. We are particularly interested in finding drug combinations that impact cancer. These combination effects are unlikely to have been detected, and our clinical data provides a unique resource for observing the effects of tens of thousands of drug combinations. We will pool the resulting causal drug effect estimates across the many cancers present in our data.
To gain insight into the cellular processes underlying clinical effect, we will examine the impact of known cancer-causing drugs in vitro, using large public cell line assays. The accompanying goal is to provide Dr. Rachel Melamed with the career development experience needed to become an independent scientist. Her research will use observational health data to understand the genesis of cancer, prevent the disease, and discover new therapeutic options. This proposal takes advantage of the interdisciplinary strengths of the University of Chicago in computation, biostatistics, and medicine, as well as institutional resources for data access and infrastructure. Dr. Melamed has assembled a team of complementary mentors and collaborators with expertise in computation, statistics, translational medicine, personalized therapy, and cancer therapy. The career development plan focuses on strengthening her statistics and machine learning skills through structured coursework and mentorship, and on gaining experience in biomedical applications through applied work and mentorship. This will give Dr. Melamed the skills to model observational data and to integrate the results with experimental data.

Public Health Relevance

Clinical data has been shown to hold patterns relating drugs to cancer onset. Using this data to find drugs that increase cancer rates will provide insight into the disease and an opportunity to prevent some cancer cases. Discovery of drugs, and particularly drug combinations, that reduce cancer rates could suggest new low-cost therapeutic options.

National Institutes of Health (NIH)
National Institute of Environmental Health Sciences (NIEHS)
Research Scientist Development Award - Research & Training (K01)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Shreffler, Carol A
University of Chicago
Internal Medicine/Medicine
Schools of Medicine
United States