The overall aim of the Data Analysis and Signature Generation Core is to use the experimental data from the Data Generation Core for multi-tier analyses that combine human interactome- and Gene Ontology-based statistical network models with dynamical multi-compartment ODE models to obtain multiple types of signatures. The data are characterized by the three LINCS data matrix axes: cell types are cardiomyocytes, hepatocytes, and neurons, both from cell lines and derived from human induced pluripotent stem cells; perturbagens are 130 individual drugs that induce cardiotoxicity, hepatotoxicity, or peripheral neuropathy, and 120 drug combinations in which a second drug mitigates the toxicity; assays are microarray for mRNA levels, mass spectrometry for protein levels, and microwestern array for protein state levels. The first signature tier is experimentally observed signatures (EOS). These are lists of differentially expressed genes based on mRNA alone, protein alone, and their combination. We expect that the combination of mRNA and protein data will be synergistic for generating signatures. The second tier is network-inferred signatures (NIS). This tier takes EOS as input and combines them with prior knowledge (human interactome, Gene Expression Omnibus) to generate biological networks, in the form of bipartite graphs, associated with drug toxicity and its potential mitigation. Analysis of these networks will identify drug-relevant pathways for assay by microwestern array; this crosstalk between cores is a key integration point of this center proposal. The third tier is dynamical model-weighted signatures (wNIS and wEOS). The key pathways identified by NIS will inform the development of dynamical, multi-compartment ODE models, which will be constrained by the microwestern array data. Global sensitivity analysis of these dynamical models will quantify the fragility of nodes in the associated networks. We will then weight the elements of the previously developed signatures by their dynamical model-predicted fragility. We expect this signature pipeline to produce roughly 4,000 signatures per year. Finally, we will use these signatures as input to build classifiers of drug-induced toxicity, using "matched" (e.g., cardiomyocyte data for a cardiotoxic drug) vs. "unmatched" conditions as training controls. The entire signature-generating pipeline and its output will be annotated with deep, integrated quality control to ensure proper dissemination of our models and signatures to the public.
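The third-tier weighting step can be illustrated with a minimal sketch. It is not the Core's actual method: the function name `weight_signature`, the gene labels, the max-normalization of fragility scores, and the neutral weight given to genes absent from the dynamical model are all assumptions made for illustration, and fragility scores are assumed to be non-negative.

```python
from typing import Dict


def weight_signature(
    eos: Dict[str, float],
    fragility: Dict[str, float],
    default_weight: float = 1.0,
) -> Dict[str, float]:
    """Scale each signature element by a normalized fragility score.

    eos            -- hypothetical signature: gene -> differential expression score
    fragility      -- hypothetical sensitivity-derived fragility per model node (>= 0)
    default_weight -- assumed weight for genes that are not nodes in the model
    """
    # Normalize by the largest fragility so weights fall in [0, 1].
    max_fragility = max(fragility.values(), default=0.0)
    weighted = {}
    for gene, score in eos.items():
        if gene in fragility and max_fragility > 0:
            weight = fragility[gene] / max_fragility
        else:
            # Gene not covered by the dynamical model: leave its score unchanged.
            weight = default_weight
        weighted[gene] = score * weight
    return weighted


# Placeholder data, not real measurements.
eos = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 0.8}
fragility = {"GENE_A": 0.5, "GENE_B": 1.0}
weos = weight_signature(eos, fragility)
```

In this sketch, genes that the model marks as more fragile retain more of their original score, while genes outside the model pass through unweighted; other normalizations or default weights would be equally plausible choices.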

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZRG1-CB-D (50))
Program Officer
Ajay, Ajay
Icahn School of Medicine at Mount Sinai
New York
United States