Centromeres and pseudoautosomal regions (PARs) are highly specialized chromatin domains that are essential for proper chromosome segregation. Centromeres provide chromosomal points of attachment to the cellular segregation machinery, linking chromosomes to the proteins that pull them to the cell poles during both somatic and germline cell divisions. The PAR is a region of conserved sequence identity between the X and Y chromosomes over which the meiotic program of pairing, synapsis, and recombination unfolds to ensure correct sex chromosome segregation. Mutations that disrupt centromere integrity or reduce homology between X- and Y-linked PARs can lead to chromosome segregation errors and constitute important genetic mechanisms for cancer, cellular senescence, and infertility. Despite their fundamental significance for chromosome transmission and genome stability, little is known about the levels and patterns of genetic diversity across centromeres and the PAR or the biological impacts of this variation. The repetitive sequence content of these regions poses a major barrier to their molecular analysis, and the PAR and centromeres remain unassembled or incompletely assembled in many of the highest-quality reference genomes. My group has recently developed experimental and bioinformatic tools that will allow us to catalog variation across the PAR and centromeres, setting the stage for subsequent investigations into the functional consequences of genetic variation across these loci. Over the next five years, we will combine these analytical tools with diverse mouse models, cytogenetic investigations of chromosomes, and evolutionary analyses to address three critical questions. First, what is the extent of DNA sequence variation across these chromatin domains? We will combine targeted long-read sequencing, re-analysis of genomic data in public archives, and analyses of the frequency of specific nucleotide "words" in collections of shotgun-sequenced reads to catalog PAR and centromere diversity in a mammalian model system, including variation in size, genomic architecture, nucleotide sequence, and repeat content. Second, how do allelic differences in PAR and centromere sequences impact their intrinsic chromatin-dependent functions in chromosome segregation and fertility? We will test explicit hypotheses about how variation at the PAR and centromeres influences fertility and biases chromosome transmission to quantify relationships between DNA sequence diversity and function. Third, what mechanisms safeguard the chromatin-based functions of these loci in the face of their rapid sequence-level evolution? We will explore possible resolutions to this perplexing duality by elucidating how naive DNA sequence acquires chromatin-dependent functions using mouse models with spontaneous PAR expansions. Overall, the success of this project will significantly advance our understanding of diversity, evolution, and function at two loci with critical biological roles in chromosome segregation that arise not from products of their DNA sequence, but rather from the intrinsic properties of their chromatin.
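
As an illustration of the nucleotide "word" (k-mer) frequency analysis mentioned above, the following is a minimal sketch and not the project's actual pipeline. It assumes reads are already available as plain strings, counts each k-mer together with its reverse complement (a common convention for unstranded shotgun reads), and skips any window containing a base other than A, C, G, or T. The function names and the choice of k are hypothetical.

```python
from collections import Counter

# Translation table for complementing A/C/G/T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Collapse a k-mer and its reverse complement to one representative."""
    return min(kmer, reverse_complement(kmer))


def count_kmers(reads, k: int) -> Counter:
    """Count canonical k-mers across an iterable of read sequences.

    Windows containing anything other than A, C, G, or T are skipped.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Convert raw counts to fractions of all counted k-mers."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}
```

Under these assumptions, comparing the relative frequency of a k-mer drawn from a known satellite repeat across samples would give a rough, assembly-free proxy for differences in repeat abundance; real analyses would also need to account for sequencing depth, read errors, and library biases.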

Public Health Relevance

Centromeres and the mammalian pseudoautosomal region (PAR) are highly specialized chromatin domains that carry out crucial roles in chromosome segregation. Due to the repeat-rich DNA content of these loci and their persistence as gaps in most reference genome assemblies, remarkably little is known about the scope of their genetic variability among individuals or the consequences of this variation for human health. This program will leverage new sequencing technologies and implement novel bioinformatic approaches to interrogate DNA sequence variation across the PAR and centromeres in diverse mouse models and determine the effects of this variation on fertility and the fidelity of chromosome segregation.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
5R35GM133415-02
Application #
9984497
Study Section
Special Emphasis Panel (ZGM1)
Program Officer
Gaillard, Shawn R
Project Start
2019-08-01
Project End
2024-07-31
Budget Start
2020-08-01
Budget End
2021-07-31
Support Year
2
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Jackson Laboratory
Department
Type
DUNS #
042140483
City
Bar Harbor
State
ME
Country
United States
Zip Code
04609