Conserved non-protein-coding elements (CNEs) are DNA elements shorter than 1 kb that are deeply conserved across vertebrate genomes from zebrafish to human. While their role is not fully understood, they are prime candidates for cis-regulatory function and can act as enhancers. As some have been implicated in human biology and disease, we developed a method to identify CNEs harboring risk SNPs identified in GWAS. Our method focused on CNE/SNP regions that are deeply conserved across vertebrate genomes and that also preserve gene synteny in their neighborhood, in order to pinpoint potential cis-regulated genes. Based on GWAS replications, we selected 20 CNE/SNP pairs and their syntenic genes potentially contributing to 5 human traits (sleep/circadian activity, skin pigmentation, cardiovascular system, eye biology, body size and morphology) that can be modeled in zebrafish. Independent, in-depth in vivo characterization of two CNEs (CNE1 and CNE19) showed that (i) human CNE-specific transcriptional enhancer activity can be revealed in live zebrafish, (ii) the risk SNP abolishes this activity, (iii) the genuine cis-regulated gene associated with the human trait can be discovered, and (iv) the underlying human biology can be identified and studied by modeling the genetic defect in zebrafish. Based on these successful validations and the promise of shedding light on the molecular and cellular biology underlying human biological traits, we propose to test the central hypothesis that deeply conserved non-coding SNPs are regulatory genetic variants responsible for differences in gene expression and function that affect human health. This hypothesis will be tested via the following specific aims.
Aim 1 will determine the transcriptional activity of the remaining 18 conserved human CNEs and associated risk SNPs in vivo, and establish the mRNA expression patterns of the 34 syntenic neighbor genes. Among the latter, Aim 2 will identify the actual cis-regulated genes via systematic CRISPR/Cas9 editing of CNEs and mRNA (dys)regulation analysis. Finally, Aim 3 will identify the genetic and biological consequences of disrupting CNEs (deletion, introduction of risk SNP) and their cis-regulated genes (indels).
Aim 1 will use transgenesis in zebrafish to demonstrate that human CNEs are enhancers whose functions are disrupted by the risk SNPs.
Aim 2 will use CRISPR/Cas9-based genome editing in zebrafish to delete all 18 CNEs (ΔCNE) or introduce risk SNPs into the zebrafish genome (CNE*) to identify the syntenic neighbor genes that are cis-(dys)regulated.
Aim 3 will compare the consequences of enhancer mutants (ΔCNE, CNE*) with those of cis-regulated gene mutants to uncover the mechanisms underlying the human biology and traits. The approach of using high-throughput CRISPR/Cas9-mediated genome editing in zebrafish to uncover the functional relevance of human CNE/SNPs is innovative. The proposed research is expected to be significant because it will establish the functional impact of non-coding genetic variants on human traits and diseases and will shed light on the associated human biology through in vivo genetic modeling in zebrafish.

Public Health Relevance

Vertebrate genomes from zebrafish to human share deeply conserved non-coding DNA elements (CNEs) that likely act as regulators of neighboring genes. We selected 20 of these DNA elements because they carry risk genetic variations (SNPs) for human health and disease. In-depth analysis of two elements in our list (CNE1 and CNE19) uncovered the actual genes and the biological processes dysregulated in the human traits. Based on this successful proof of concept, we propose here to characterize in vivo the remaining 18 elements and associated genetic variations to uncover some of the genetic bases and biological mechanisms affected in 5 human traits for which we have extensive experience and assays.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genetics of Health and Disease Study Section (GHD)
Program Officer
Krasnewich, Donna M
Stanford University
Schools of Medicine
United States