Although numerous genomes, including the human genome, have been completely sequenced, the specific function of most of the DNA remains unknown. Identifying all the functional components of genomes has become an important goal of the NIH (e.g., via the ENCODE and modENCODE initiatives). A significant fraction of this DNA is believed to be involved in regulating gene expression, a fundamental process that plays key roles in both normal development and disease. A basic unit of gene regulation is the cis-regulatory module (CRM; often referred to as an "enhancer"), but identifying these modules on a genomic scale has proven difficult. For the most part, computational methods for CRM discovery have been effective only where there is already an extensive body of knowledge about the transcription factors that bind to the CRMs and the sequences (motifs) to which they bind. In this proposal, we develop novel computational tools for CRM discovery. In particular, we depart from current approaches by developing algorithms that do not rely on prior knowledge of transcription factor binding motifs. By doing so, we are able to identify CRMs even in less well-studied biological contexts where prior knowledge is minimal or lacking. We then expand upon this approach by developing methods that use partial prior knowledge of CRMs known to be involved in a particular biological process. We will combine our new methods with promising existing approaches to generate a computational pipeline that uses complementary strategies for sensitive and specific CRM discovery, and we will conduct extensive prediction of CRMs that function in many tissues and cell types.
We will take advantage of the powerful genomic and experimental resources available for the model organism Drosophila melanogaster to validate all of our methods both in silico, using a large body of existing CRM data that we have compiled, and in vivo, through extensive empirical testing in transgenic animals. The methods we develop here will be instrumental in helping to identify an important class of genomic functional element, the cis-regulatory module, in any metazoan genome.
cis-Regulatory modules (CRMs) are key mediators of normal phenotypic variation, drivers of evolutionary change, and causes of birth defects as well as chronic and acute disease. Identifying CRMs genome-wide is an important first step toward comprehending both normal and pathological aspects of gene regulation and gene function, with broad implications for understanding disease, predicting disease risk, and preventing and curing disease.
Suryamohan, Kushal; Hanson, Casey; Andrews, Emily et al. (2016) Redeployment of a conserved gene regulatory network during Aedes aegypti development. Dev Biol 416:402-13
Blatti, Charles; Kazemian, Majid; Wolfe, Scot et al. (2015) Integrating motif, DNA accessibility and gene expression data to build regulatory maps in an organism. Nucleic Acids Res 43:3998-4012
Suryamohan, Kushal; Halfon, Marc S (2015) Identifying transcriptional cis-regulatory modules in animal genomes. Wiley Interdiscip Rev Dev Biol 4:59-84
Duque, Thyago; Samee, Md Abul Hassan; Kazemian, Majid et al. (2014) Simulations of enhancer evolution provide mechanistic insights into gene regulation. Mol Biol Evol 31:184-200
Blatti, Charles; Sinha, Saurabh (2014) Motif enrichment tool. Nucleic Acids Res 42:W20-5
Atkinson, Taylor J; Halfon, Marc S (2014) Regulation of gene expression in the genomic context. Comput Struct Biotechnol J 9:e201401001
Samee, Md Abul Hassan; Sinha, Saurabh (2014) Quantitative modeling of a gene's expression from its intergenic sequence. PLoS Comput Biol 10:e1003467
Kazemian, Majid; Suryamohan, Kushal; Chen, Jia-Yu et al. (2014) Evidence for deep regulatory similarities in early developmental programs across highly diverged insects. Genome Biol Evol 6:2301-20
Kazemian, Majid; Pham, Hannah; Wolfe, Scot A et al. (2013) Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development. Nucleic Acids Res 41:8237-52
Cheng, Qiong; Kazemian, Majid; Pham, Hannah et al. (2013) Computational identification of diverse mechanisms underlying transcription factor-DNA occupancy. PLoS Genet 9:e1003571