Unravelling the genetic basis of human health and disease requires high-quality genome information. Heretofore, the community has relied on a single, haploid human reference sequence that does not represent the actual genome sequence of any single person. While this resource has been enormously beneficial for mapping many of the genetic variants responsible for normal and disease-associated human variation, it can limit many types of analyses. Further, this single-reference model can yield uneven power to analyze genetic variation across human populations. Similarly, non-human primate genomes are fundamentally important for understanding our own genome. While many draft primate genome assemblies are available, including for all great apes, these genomes are all of lower quality and contiguity than the human reference genome. Importantly, chromosome-scale scaffolding of these genomes was often done by comparison to the human reference. While this approximation is generally correct, knowing where it is wrong is critically important. Using a radically innovative and simple approach, we can now generate highly contiguous de novo assemblies of human and non-human primate genomes. The approach requires sub-microgram quantities of DNA and can be carried out from start to finish within a few months, including sequencing time. Our approach uses genome contiguity information derived from proximity ligation of in vitro assembled chromatin. It harnesses the speed and cost-effectiveness of high-throughput sequencing to generate large amounts of haplotype-phased contiguity data spanning well over 100 kilobases. Using this approach, we will generate high-accuracy, partially haplotype-phased de novo genome assemblies from 50 humans and 12 non-human primates, with scaffold N50s expected to be between 10 and 20 Mb.
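For readers unfamiliar with the scaffold N50 metric named above, the following is a minimal sketch of how it is conventionally computed: the length of the scaffold at which the running sum of scaffold lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative assumptions, not data or code from this project.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Illustrative helper (not from the project): N50 is the length L such that
    scaffolds of length >= L together cover at least half of the assembly.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in Mb, chosen only to illustrate the metric.
example_lengths_mb = [40, 25, 15, 10, 5, 5]
print(n50(example_lengths_mb))
```

Under this definition, an assembly with a scaffold N50 of 10 to 20 Mb has at least half of its total sequence in scaffolds of that length or longer.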
The human genome sequence is a critically important resource for biomedical discovery. Using an innovative approach, we will generate genome data from humans and non-human primates that will augment and extend our knowledge of human genome variation.
Lazar, Nathan H; Nevonen, Kimberly A; O'Connell, Brendan et al. (2018) Epigenetic maintenance of topological domains in the highly rearranged gibbon genome. Genome Res 28:983-997
Kuderna, Lukas F K; Tomlinson, Chad; Hillier, LaDeana W et al. (2017) A 3-way hybrid approach to generate a new high-quality chimpanzee reference genome (Pan_tro_3.0). Gigascience 6:1-6