This project will develop novel computational methods that leverage diverse data sources, including the rich information generated by the NIH Common Fund projects, for drug repurposing. Repurposing may dramatically lower the risk of drug development by skipping early-stage trials, shortening development time, and reducing capital investment. With advances in high-throughput sequencing and other massively parallel technologies, more and more omics data are becoming available for biomedical research. These genomic, transcriptomic, proteomic, metabolomic, and microbiomic data can help biomedical researchers better understand, from different perspectives, the complex biological systems underlying human diseases. For example, genome-wide association and sequencing studies have identified tens of thousands of variants that are significantly associated with one or more complex traits. Despite these successes, the results have not been fully translated into clinical value. The overall goal of this pilot project is to leverage the rich information generated by the NIH Common Fund projects, in combination with other public data sets, to explore the feasibility of drug repurposing through novel computational approaches. The ultimate goal is to develop, implement, and apply a computational framework that integrates data from the Common Fund projects and other resources to identify potential uses of existing drugs for new indications. We will also make our newly developed tools available to the general research community.
This will be accomplished through: [1] further development of a powerful framework proposed by our group that leverages cross-tissue information in the GTEx data to impute gene expression more accurately within each tissue, and that combines single-tissue association tests into a powerful test for gene-trait association using summary statistics from genome-wide association studies (GWAS); [2] development of a signature-matching-based drug repurposing framework that uses gene expression data from diverse sources (drug perturbation experiments, case-control studies, and patient intervention studies) together with GWAS summary statistics; and [3] implementation and application of the proposed framework to discover candidate drugs for repurposing to diseases in critical need of drug development, e.g. non-alcoholic steatohepatitis. On completion of the pilot project, we will be able to assess the feasibility of the proposed drug repurposing framework and decide whether it merits further development and implementation.
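
To make the signature-matching idea in aim [2] concrete, the following is a minimal sketch and not the project's actual method. It assumes that a disease signature and each drug perturbation signature are given as per-gene scores (for example, log fold changes), and that a drug is a repurposing candidate when its signature is strongly anti-correlated with the disease signature. The function names, the Spearman scoring choice, the `min_genes` threshold, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_drugs(
    disease_sig: Dict[str, float],
    drug_sigs: Dict[str, Dict[str, float]],
    min_genes: int = 3,  # assumed threshold for illustration only
) -> List[Tuple[str, float]]:
    """Score each drug by correlation with the disease signature.

    The most negative scores come first: those drugs shift expression
    in the direction opposite to the disease signature.
    """
    results = []
    for drug, sig in drug_sigs.items():
        genes = sorted(set(disease_sig) & set(sig))  # shared genes only
        if len(genes) < min_genes:
            continue
        rho = spearman([disease_sig[g] for g in genes], [sig[g] for g in genes])
        results.append((drug, rho))
    return sorted(results, key=lambda item: item[1])


# Toy, made-up signatures (not real data).
disease = {"G1": 2.0, "G2": 1.0, "G3": -1.0, "G4": -2.0}
drugs = {
    "drug_A": {"G1": -1.8, "G2": -0.9, "G3": 1.1, "G4": 2.2},  # reverses disease
    "drug_B": {"G1": 1.5, "G2": 0.7, "G3": -0.6, "G4": -1.9},  # mimics disease
}
for name, score in rank_drugs(disease, drugs):
    print(name, round(score, 3))
```

In this toy example the drug whose signature reverses the disease signature is ranked first. A real analysis would need to handle many more genes, multiple cell lines and doses, and statistical significance, none of which this sketch attempts.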

Public Health Relevance

We will develop novel computational methods for repurposing drugs. Through rigorous statistical modeling and inference, these methods will leverage diverse data sources, including the Genotype-Tissue Expression (GTEx) project and the Library of Integrated Network-based Cellular Signatures (LINCS), as well as the large collection of summary statistics from many genome-wide association studies and gene expression profiling efforts. If successful, our approach may dramatically lower the risk of drug development by skipping early-stage trials, shortening development time, and reducing capital investment.

National Institutes of Health (NIH)
Office of The Director, National Institutes of Health (OD)
Small Research Grants (R03)
Project #
Application #
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Resat, Haluk
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
Yale University
Biostatistics & Other Math Sci
Schools of Public Health
New Haven
United States
Zip Code