The centromere is a cytologically (visually) defined entity with highly conserved functions. It is responsible for sister chromatid cohesion and is the site of assembly of the kinetochore, a protein structure that links chromosomes to spindle fibers during cell division. In most multicellular eukaryotes, centromeres are embedded within heterochromatin, a tightly packed form of DNA, and are associated with long tracts of satellite repeats in which genes are largely absent. Thus, the centromere is traditionally viewed as a highly heterochromatic and genetically silent chromosomal domain.
Recent research on the centromere of rice chromosome 8 (Cen8), as well as on neocentromeres (newly formed centromeric regions) in humans, suggests that centromeres originated from genic regions. This project will focus on centromere evolution using rice Cen8 as a model. Specifically, this project will develop bacterial artificial chromosome (BAC)-based physical maps that span Cen8 in six diploid wild rice species and produce approximately one megabase of DNA sequence from each of the six centromeres. The functional boundaries of the six centromeres will be determined using cytogenetic and molecular methods. The structural rearrangements and sequence divergence associated with the six centromeres will be revealed by comparative sequence analysis. Expression and evolution of Cen8-associated genes will be explored to understand how such genes adapt to a unique chromosomal domain. The results from this project will shed light on centromere evolution and will also build a foundation for studying the evolution of genes located in heterochromatic and recombinationally suppressed chromosomal domains.
Access to Project Outcomes: Sequence data and gene annotation results will be made available to the public through GenBank, the project website (accessible through www.omap.org/) and community databases such as Gramene (www.gramene.org).
The centromere is the chromosomal domain that directs proper segregation and transmission of the chromosomes. In most multicellular eukaryotes, the centromeres are embedded within long arrays of repetitive DNA sequences, often tandemly repeated sequences that are also called satellite repeats. Thus, the centromere is considered a ‘genetically silent’ chromosomal domain. However, we have demonstrated that the centromere of rice chromosome 8 (Cen8) contains a very limited amount of satellite repeats, which account for <10% of the functional core of Cen8 (750 kilobases, kb). The majority of the DNA sequences in Cen8 are not significantly different from average genomic DNA sequences within the rice genome, suggesting that centromeres may have originated from genic regions. We hypothesize that rice Cen8 may represent an intermediate stage in the progression from a new centromere (neocentromere) to a mature centromere.
In this project, we sequenced the Cen8 of five different wild rice relatives. These relatives diverged from cultivated rice 0.5 to 10 million years ago. Thus, we generated an unprecedented sequence resource for comparative and evolutionary studies of a centromere among a group of genetically related eukaryotic species. We made several novel discoveries from the sequence data:
(1) We determined the functional core within each Cen8 by mapping the DNA sequences associated with the centromere-specific protein CENH3. We discovered that the Cen8s in different species maintained a similar size even though their DNA sequences have diverged significantly.
(2) Dramatic sequence divergence and rearrangements were discovered in the Cen8 of different wild rice species. A 155-bp satellite repeat, CentO, and a centromere-specific retrotransposon, CRR, are the main DNA components of rice centromeres. However, centromeric satellite repeats were not discovered in O. granulata, one of the wild rice species. O. granulata represents a rare example of a species whose centromeres lack any satellite repeats. CRR or CRR-related elements were discovered in the centromeres of all species except O. brachyantha. Instead, we discovered a new retrotransposon, FRetro3, in O. brachyantha centromeres. FRetro3 is highly enriched in centromeres and is not present in other rice species. These results show that the entire centromere can be completely replaced by a single repeat, either a satellite repeat or a retrotransposon, in a few million years of evolution.
(3) We discovered a set of seven Cen8 genes conserved in rice and two wild species, O. glaberrima and O. brachyantha, that diverged from rice one and ten million years ago, respectively. Surprisingly, all seven genes were found to be under purifying selection, representing a striking phenomenon of active gene survival within a ‘genetically silent’ (recombination-free) zone over a long evolutionary time. The coding sequences of the Cen8 genes showed sequence divergence and mutation rates that were significantly lower than those of genes located on non-centromeric chromosome arms, suggesting a mechanism that maintains the fidelity and functionality of the Cen8 genes. These centromeric genes provide a foundation for future studies of how genes survive in recombination-suppressed chromosomal domains.
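Purifying selection is usually inferred when amino-acid-changing (nonsynonymous) substitutions in a gene are rarer than silent (synonymous) ones, relative to the number of sites where each kind of change is possible. The report does not state which method was used for the Cen8 genes, so the following is only a minimal illustrative sketch: it counts codons that differ between two aligned coding sequences and labels each difference as synonymous or nonsynonymous using the standard genetic code. The function name and the toy sequences are hypothetical, and the raw counts are a crude proxy, not a normalized dN/dS estimate.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Count differing codons between two aligned, in-frame coding sequences.

    Returns (synonymous, nonsynonymous). Hypothetical helper for illustration;
    it does not normalize by the number of synonymous/nonsynonymous sites.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")

    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        # Skip codons containing gaps or ambiguous bases.
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences (invented for this example, not real Cen8 data).
syn, nonsyn = count_codon_differences("ATGGCTAAAGGT", "ATGGCCAAAGAT")
print(f"synonymous={syn}, nonsynonymous={nonsyn}")
```

In a real analysis, these counts would be normalized by the numbers of synonymous and nonsynonymous sites and corrected for multiple substitutions before any ratio is interpreted.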