The proper specification of the neuroectoderm, neural crest, and neural progenitor populations dictates the overall success of neurodevelopment, and disruption of these populations is at the root of many developmental diseases. Recent advances in single cell transcriptomics allow for the study of this process in tens of thousands of cells across the transcriptome. Unlike bulk sequencing, which averages expression across a population, single cell approaches are essential to studying a multifaceted, asynchronous process like neurodevelopment. However, several computational barriers exist to analyzing scRNA-seq data, including dropout, technical noise, and batch effects. Furthermore, studying developing neural tissue provides its own challenges because in many contexts, progenitor cells and mature progeny are present at the same time. Thus, gaining biological insight from these data requires development of novel computational methods. The overall goal of this proposal is to establish computational methods to study single cell gene expression during neurodevelopment, focusing on human embryoid bodies (EBs).
In aim 1, I will characterize the branching structure of germ layer and neural specification in a 27-day time course of EB culture. In my preliminary results, I used PHATE, a dimensionality reduction tool I co-developed, to describe a set of smooth transitions from stem cell through the three germ layers to several derivatives including progenitors of the cardiac, bone, and neural lineages. I will validate this structure of differentiation by FACS and bulk RNA-sequencing of predicted neuroectoderm and neural progenitor cells.
In aim 2, I will describe the smooth changes of gene expression that drive specification of the neural lineages by inferring the latent developmental time in the EB time course. To assign each cell a single developmental time label, I developed a method called MELD (Manifold Enhancement of Latent Dimensions). In my preliminary work, I used MELD to recapitulate known patterns of gene expression during the specification of neural crest cells. In this aim, I will expand my analysis to the neuroectoderm and neural progenitor populations and confirm that MELD predicts intermediate cell states present during time points not sampled in the original time course. The work in this aim will generate a continuous roadmap of gene expression from stem cell through neural progenitor in EBs.
In aim 3, I will use mutual information to study edge-rewiring of transcription factors and their targets, that is, how the regulatory relationships change during neurogenesis. By modelling the statistical dependency between regulatory factors and their targets as a continuous process, it will be possible to infer the crucial windows during which regulatory factor expression has the strongest influence on downstream targets. To confirm the timing of these windows, I will use doxycycline induction to show that induction has a greater effect on downstream transcription within windows than outside them. Together, these insights will produce methods to gain a greater understanding of neurogenesis from single cell RNA-sequencing data.
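The idea of tracking the statistical dependency between a regulatory factor and its target along developmental time can be illustrated with a small sketch. This is not the method proposed here: it assumes a simple plug-in mutual information estimate on equal-width bins and a coarse sliding window over cells ordered by an inferred time label, standing in for a continuous model. All names (`discretize`, `mutual_information`, `windowed_mi`), the window size, the bin count, and the synthetic data are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Plug-in estimate of mutual information (in nats) from binned values."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts
        mi += (count / n) * math.log(count * n / (px[i] * py[j]))
    return mi


def windowed_mi(time, tf, target, window=200, step=100, bins=4):
    """Estimate MI between tf and target in sliding windows of time-ordered cells.

    Returns a list of (mean time of the window, MI estimate) pairs.
    """
    order = sorted(range(len(time)), key=lambda k: time[k])
    profile = []
    for start in range(0, len(order) - window + 1, step):
        idx = order[start:start + window]
        mean_time = sum(time[k] for k in idx) / window
        mi = mutual_information(
            [tf[k] for k in idx], [target[k] for k in idx], bins
        )
        profile.append((mean_time, mi))
    return profile


# Synthetic data (an assumption for illustration only): the target follows
# the factor only for cells whose time label lies between 0.4 and 0.6.
random.seed(0)
n_cells = 1000
time = [random.random() for _ in range(n_cells)]
tf = [random.gauss(0, 1) for _ in range(n_cells)]
target = [
    (f if 0.4 <= t <= 0.6 else 0.0) + random.gauss(0, 0.3)
    for t, f in zip(time, tf)
]

profile = windowed_mi(time, tf, target)
peak_time, peak_mi = max(profile, key=lambda pair: pair[1])
```

In this toy setting, the window with the highest estimate should sit near the middle of the time axis, where the dependency was planted. On real scRNA-seq data, dropout and the bias of binned estimators would need to be handled before interpreting such a profile.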

Public Health Relevance

The proper specification of the neuroectoderm, neural crest, and neural progenitor populations dictates the overall success of neurodevelopment, and disruption of these populations is at the root of many developmental diseases and birth defects. Studying the dynamics of gene expression and gene regulation during this crucial developmental period will generate insight into the mechanisms of typical neurodevelopment and developmental diseases. This information will benefit genetic testing and counseling, as well as improve capabilities for personalized and regenerative medicine.

Agency
National Institutes of Health (NIH)
Institute
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Type
Predoctoral Individual National Research Service Award (F31)
Project #
5F31HD097958-02
Application #
9899738
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Henken, Deborah B
Project Start
2019-04-01
Project End
2022-03-31
Budget Start
2020-04-01
Budget End
2021-03-31
Support Year
2
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Yale University
Department
Genetics
Type
Schools of Medicine
DUNS #
043207562
City
New Haven
State
CT
Country
United States
Zip Code
06520