Many forms of biomolecular (e.g., gene expression, genetics, proteomics) and clinical (e.g., clinical biomarkers, drug targets and indications) data pertaining to many different diseases are now readily available from publicly available data repositories and knowledge bases. This creates an opportunity to integrate these data into a unified, globally coherent representation of human disease, or nosology. Such a nosology would express how diseases are related to one another across multiple molecular and clinical axes. In this competitive renewal, we plan a major expansion of this project. We plan to capture data from newer public repositories that hold more types of molecular measurements. Including genetic and protein measurements will enable richer modeling of diseases and disease similarity than mRNA measurements alone. To help link the molecular changes seen in disease to genetic differences, we plan to incorporate expression quantitative trait loci (eQTLs), derived from simultaneous genetic and expression measurements, into our disease models. To expand the utility of our nosology in personalized medicine, we plan to incorporate more quantitative epidemiological measurements of disease, and to model transitions between disease states using probabilistic relational modeling. We will compare our nosology with the well-known ICD-10 as well as ICD-11, which is under development. We will develop novel visualization methods for the complex of nodes and edges seen in nosologies. We also plan to test our nosology in two Driving Biological Projects, one in small cell lung cancer and one in immunology and disease, with the specific goal of yielding novel diagnostics and therapeutics ready for clinical trials.

Public Health Relevance

In this competitive renewal, building on 36 publications from the first funding period, we plan to create a new disease classification based on clinical, molecular, and epidemiological data and knowledge, and to use this classification to identify novel diagnostics and drugs for small cell lung cancer and immunological disease.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Long, Rochelle M
Stanford University
Schools of Medicine
United States
Pessetto, Ziyan Y; Chen, Bin; Alturkmani, Hani et al. (2017) In silico and in vitro drug screening identifies new therapeutic approaches for Ewing sarcoma. Oncotarget 8:4079-4095
Hadley, Dexter; Pan, James; El-Sayed, Osama et al. (2017) Precision annotation of digital samples in NCBI's gene expression omnibus. Sci Data 4:170125
Chen, Bin; Wei, Wei; Ma, Li et al. (2017) Computational Discovery of Niclosamide Ethanolamine, a Repurposed Drug Candidate That Reduces Growth of Hepatocellular Carcinoma Cells In Vitro and in Mice by Inhibiting Cell Division Cycle 37 Signaling. Gastroenterology 152:2022-2036
Chen, Bin; Ma, Li; Paik, Hyojung et al. (2017) Reversal of cancer gene expression correlates with drug efficacy and reveals therapeutic targets. Nat Commun 8:16022
Schmajuk, Gabriela; Tonner, Chris; Trupin, Laura et al. (2017) Using health-system-wide data to understand hepatitis B virus prophylaxis and reactivation outcomes in patients receiving rituximab. Medicine (Baltimore) 96:e6528
Kodama, Keiichi; Zhao, Zhiyuan; Toda, Kyoko et al. (2016) Expression-Based Genome-Wide Association Study Links Vitamin D-Binding Protein With Autoantigenicity in Type 1 Diabetes. Diabetes 65:1341-9
Chen, B; Butte, A J (2016) Leveraging big data to transform target selection and drug discovery. Clin Pharmacol Ther 99:285-97
Paik, H; Chen, B; Sirota, M et al. (2016) Integrating Clinical Phenotype and Gene Expression Data to Prioritize Novel Drug Uses. CPT Pharmacometrics Syst Pharmacol 5:599-607
Bagley, Steven C; Sirota, Marina; Chen, Richard et al. (2016) Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants. PLoS Comput Biol 12:e1004885
Kosti, Idit; Jain, Nishant; Aran, Dvir et al. (2016) Cross-tissue Analysis of Gene and Protein Expression in Normal and Cancer Tissues. Sci Rep 6:24799

Showing the most recent 10 out of 77 publications