NISC currently operates the following suite of production sequencing machines: 1 PacBio RS II, 4 HiSeq2500s, and 3 MiSeqs. Using these platforms, we have generated over 620 billion reads in the past year alone. Although we have consistently operated at the level of a mid-scale genome sequencing center, we have maintained advantageous economies of scale while remaining relatively agile. In keeping with the Comparative Sequencing interests, several years ago NISC implemented an amplicon-based Sanger sequencing pipeline designed to focus on intra-species variation. Numerous clinically relevant projects were designed to amplify and sequence specific genes and regions of interest in small groups of human subjects, yielding great insights into disease-related genotype/phenotype combinations. As an early model for the application of genome sequence data to medical research, our flagship ClinSeq Project greatly advanced the study of atherosclerosis by providing sequence data for 250 genes in over 500 volunteers (www.genome.gov/20519355). While this approach was extremely productive, we evaluated and then adopted the NextGen sequencing platform to collect whole exome data more efficiently and rapidly for the ClinSeq Project, followed by many other medically relevant projects. As a consequence of these advances, NISC no longer offers Sanger-based amplicon targeted sequencing in production mode. One publication related to our earlier Sanger-based efforts is listed in the publications section of this report (8). The adoption of many new sequencing protocols in production created a commensurate need for dramatic changes to sample tracking, flow control and primary analysis pipelines, as well as to project management and cost accounting. A dedicated NISC team rapidly designed, developed and implemented a new Laboratory Information Management System (LIMS) that has met the initial challenges and continues to evolve quickly to adapt to a continuous flow of changes in sequencing technologies.
A combination of talented IT staff and bioinformaticians has met the challenges of extremely large and complex data sets by implementing and continuously adapting pipeline programs to support the rapidly evolving software associated with each of the sequencing platforms. Beyond primary analysis that results in DNA basecalls and quality scores, NISC has worked closely with members of other NHGRI research groups to implement and support high-throughput production of biologically relevant secondary analysis. One shining example of these efforts is the production-scale processing of Whole Exome Sequencing (WES) data for all of our clients, the end product of which is a distilled set of variants of interest, accessible in a user-friendly fashion through the in-house-developed VarSifter program. The success of these programs has led to an increasing number of projects from a growing number of investigators. Last year we added a CLIA-compliant pipeline for WES of samples originating from the NIH Clinical Center through the Clinical Center Genomics Opportunity program (www.genome.gov/27558725) and have completed sequencing of over 100 samples. Just recently, the Exome with Secondary Findings Analysis Test was CLIA certified. The implementation of improved project management tools is helping to address the challenges associated with such growth, which is now yielding publications on WES (n = 6) (4, 5, 7, 10, 11, 14), Whole Genome Sequencing, Assembly and/or Annotation (n = 1) (1), RNAseq (n = 1) (13), a microbiome study (n = 1) (6), and HIV antibody studies (n = 4) (3, 9, 12, 15). Finally, of all the publications over the last year, the one that most pushed the envelope in the use of cutting-edge sequencing technologies was the study of plasmid diversity from a hospital-associated outbreak of carbapenemase-producing Enterobacteriaceae (2).
For this study we used the PacBio RS II single-molecule sequencer to generate complete and highly accurate assemblies of these bacterial genomes along with their associated plasmids. A key aspect of this project was the exhaustive validation of the PacBio-assembled genomes conducted at NISC, which involved expert manual review, genome-wide comparison with sequence data from an orthogonal sequencing technology, and optical mapping to confirm the long-range structure of the assemblies. This analysis estimated that the assemblies generated by PacBio were at least 99.9999% accurate, thus providing key evidence that the PacBio system is a powerful method for generating complete and accurate genomes of drug-resistant bacteria. In the foreseeable future, NISC is well positioned to provide next-gen sequence data for several large, multi-year projects, such as the Skin Microbiome Project and the Mouse Methylome Project (a collaboration with NIEHS), and to expand access to sequencing for Intramural NHGRI investigators through continued sequencing support of their most promising projects. Our focus is to increase the operational efficiency of the next-gen pipeline, refine existing protocols, implement additional protocols as researchers request new sample and experimental types, and continue to expand the value-added data analysis packages available. The PacBio sequencing instrument is now in production for a variety of library types, and we are testing specific applications for the BioNano Genomics/Irys physical restriction mapping platform. Furthermore, we will continue to monitor developments in the rapidly evolving sequencing and informatics technologies, implementing those we deem most appropriate for the sequence data we produce for collaborating investigators.

Support Year: 15
Fiscal Year: 2015
Name: Human Genome Research