Systems biology methods have shown great promise in providing a better understanding of human disease and in identifying new disease targets. Nonetheless, it remains extraordinarily difficult to identify causal genes in most genetic diseases, particularly in highly polygenic disorders, where current approaches are most limited. These methods also typically stop once a target is identified, after which research reverts to traditional paradigms for drug discovery. We hypothesize (the 'phenolog' hypothesis) that identifying equivalent gene networks in humans and model organisms will reveal new candidate disease genes and new model systems for diseases. Moreover, such models open the possibility of pursuing drug discovery based on the networks in the model organisms. We suggest that, because pathways can evolve and be repurposed in different organisms, phenologs (similar or orthologous gene networks that nonetheless produce different phenotypes) may be present, and that these phenologs provide a basis not just for screening against a single protein but for identifying, and pursuing drug discovery against, multiple different targets in parallel. As an example of the importance of appreciating the evolutionary repurposing of pathways, we identify a yeast model of angiogenesis and propose its application to disease gene and drug discovery. The same theoretical framework suggests a mouse model of autism, a worm model of breast cancer, and more. Our major aim is to test the phenolog hypothesis, primarily by using the yeast model to discover new angiogenesis genes and by performing yeast-based compound screening to find new classes of angiogenesis inhibitors suitable as lead compounds for anti-cancer therapies.
Phenologs offer the possibility of associating new genes with polygenic diseases, and of opening up both drug screens in model organisms and follow-up studies that search for genetic variation in the candidate disease genes. The theoretical phenolog framework thus has the potential to impact a wide variety of diseases and a large downstream community.

Public Health Relevance

Common diseases, such as coronary artery disease, diabetes, and autism, often arise from the effects of many genes, and this polygenic nature complicates traditional methods of discovering causal genes. This grant proposes a novel approach for identifying candidate genes for polygenic diseases, with a specific focus on defects in angiogenesis, which affect wound healing, cardiovascular disease, and tumor malignancy. Anti-angiogenesis drugs play important roles as anti-tumor agents, and the model we propose suggests a path for identifying both angiogenesis candidate genes and new anti-angiogenesis compounds. More generally, this work will increase our understanding of the genetic basis of polygenic diseases and will be a step towards developing genetic diagnostics for susceptibility to these debilitating diseases.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Special Emphasis Panel (ZGM1-CBB-7 (EU))
Program Officer
Krasnewich, Donna M
University of Texas Austin
Schools of Arts and Sciences
United States
Kwon, Taejoon; Chung, Mei-I; Gupta, Rakhi et al. (2014) Identifying direct targets of transcription factor Rfx2 that coordinate ciliogenesis and cell movement. Genom Data 2:192-194
Chung, Mei-I; Peyrot, Sara M; LeBoeuf, Sarah et al. (2012) RFX2 is broadly required for ciliogenesis during vertebrate development. Dev Biol 363:155-65
Boutz, Daniel R; Collins, Patrick J; Suresh, Uthra et al. (2011) Two-tiered approach identifies a network of cancer and liver disease-related genes regulated by miR-122. J Biol Chem 286:18066-78
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine et al. (2011) MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines. J Proteome Res 10:2949-58
Park, Yungki; Marcotte, Edward M (2011) Revisiting the negative example sampling problem for predicting protein-protein interactions. Bioinformatics 27:3024-8
Vogel, Christine; Abreu, Raquel de Sousa; Ko, Daijin et al. (2010) Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol Syst Biol 6:400
Wang, Peggy I; Marcotte, Edward M (2010) It's the machine that matters: Predicting gene function and phenotype from protein networks. J Proteomics 73:2277-89
Madsen, James A; Boutz, Daniel R; Brodbelt, Jennifer S (2010) Ultrafast ultraviolet photodissociation at 193 nm and its applicability to proteomic workflows. J Proteome Res 9:4205-14
Park, Yungki (2009) Critical assessment of sequence-based protein-protein interaction prediction methods that do not require homologous protein sequences. BMC Bioinformatics 10:419