In genetics, genome-wide association studies (GWAS) of common variants and exome sequencing-based analyses are common strategies for elucidating the relationship between genetic variants and a specific phenotype. While these approaches have strengths, they also have significant limitations: they cannot identify the complex biological interactions that lead to genetic predispositions, cannot integrate distinct but related phenotypes, and cannot separate the effects of genetic variants by tissue. If a phenotype manifests only through the complex interplay of multiple factors, it can be impossible to isolate the individual contributors by investigating genotype-phenotype associations for a single outcome trait or disease. To affect a disease, a drug must act on the right target in the right tissue. Bioinformatics approaches that integrate multiple key layers of information to reveal effective drugs will address a critical unmet need, because a complex interplay of factors is expected to underlie most human phenotypes and diseases.

The overall objective of this proposal is to develop algorithms that integrate gene and phenome-wide association results with chromosome structure data and functional relationship networks to identify genes that give rise to complex phenotypes and drugs that modify them. These algorithms will provide a new means to study the genetic etiology of complex traits and outcomes, increasing the interpretability of high-throughput association testing and, ultimately, the insights it generates. The proposal's rationale is that robust tissue-specific methods will enable geneticists, researchers with biorepositories, and those with access to other extensive phenotyping data to effectively reposition drugs and identify new targets.
Complementary algorithms to address distinct aspects of this challenge are proposed as specific aims:
(AIM 1) Development of algorithms that integrate exome sequencing results with biological networks to identify genes and pathways associated with phenotypes in specific tissues;
(AIM 2) Development of algorithms that integrate 3D genome structure with robust associations via biological networks to identify genes underlying phenotypes in specific tissues;
(AIM 3) Development of algorithms that identify drugs that specifically alter regions of gene-gene networks associated with a complex phenotype.
Methods will be applied to phenome-wide analysis of the Geisinger Health System MyCode biorepository and a subset of candidates will be validated via molecular assays.
The outcomes of this grant, namely algorithms for tissue-specific network analysis of genes and drugs, are expected to generate positive translational impact because such algorithms enable researchers to translate existing data resources into causal genes and effective drugs.

Public Health Relevance

The proposed research is relevant to public health because algorithms that identify the genes involved in common human diseases such as diabetes and hypertension, along with potential new drugs that might target them, allow researchers and doctors to better prevent, predict, diagnose, and treat these diseases. Specifically, these algorithms are expected to open new research horizons: researchers who identify genes that cause a disease and compounds that may treat it can design experiments to test potential therapies. The proposed algorithms link phenotypic and genomic information with tissue-specific networks, constructed by integrating diverse data and databases, to identify new drugs. They are particularly relevant to the NIH's mission and particularly timely given the funding of the Precision Medicine Initiative.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Sofia, Heidi J
University of Pennsylvania
Schools of Medicine
United States
Park, YoSon; Greene, Casey S (2018) A parasite's perspective on data sharing. Gigascience 7: