Inter- and Intra-Species Comparative Sequencing

Mullikin, James

Abstract

Over the last year, NISC operated the following suite of production sequencing machines: 1 PacBio Sequel, 1 HiSeq 2500, 1 NextSeq 550, 1 NovaSeq 6000, and 3 MiSeqs. Using these platforms, we have generated over 1,403 billion reads in the past year. Though we remain consistently at the level of a mid-scale genome sequencing center, we have maintained advantageous economies of scale while remaining relatively agile. The addition of the NovaSeq 6000 allows NISC to effectively meet the rising interest in studying whole genome sequence datasets, with over 500 human genomes sequenced and analyzed so far in 2019.

The ongoing adoption of new sequencing protocols in production creates a commensurate need for dramatic changes to sample tracking, flow control and primary analysis pipelines, as well as project management and cost accounting. A concerted effort by a dedicated NISC team to rapidly design, develop and implement a new Laboratory Information Management System (LIMS) has fulfilled these needs, and the system continues to evolve quickly to adapt to a continuous flow of changes in sequencing technologies. A combination of talented IT staff and bioinformaticians has met the challenges of extremely large and complex data sets by implementing and continuously adapting pipeline programs to support rapidly evolving software associated with each of the sequencing platforms.

Beyond primary analysis that results in DNA basecalls and quality scores, NISC has worked closely with members of other NHGRI research groups to implement and support high-throughput production of biologically relevant secondary analysis. One shining example of these efforts is the production-scale processing of Whole Genome Sequencing (WGS) data for all of our clients, using the GATK4 best practices pipeline. To prepare for an increase in the quantity of WGS samples, we are implementing new GPU-based hardware and software that will dramatically accelerate WGS data analysis in the near future. With these new systems in place, we expect to analyze WGS datasets at a rate of one human genome per hour, matching the maximum throughput of our NovaSeq 6000.

Publications for fiscal year 2019 span a wide range of projects, and are summarized as follows:
1) WES projects (n = 4) (Brooks, Zein et al. 2018, Sapp, Johnston et al. 2018, Jenkins, Almli et al. 2019, Pemov, Hansen et al. 2019)
2) Whole Genome Sequencing, Assembly and/or Annotation (n = 1) (Chen, Omori et al. 2019)
3) RNAseq, ChIPseq, and ATACseq (n = 2) (Zhang, Choi et al. 2018, Lawlor, Marquez et al. 2019)
4) Methylome (n = 1) (Grimm, Shimbo et al. 2019)
5) Microbiome study (n = 2) (Johnson, Deming et al. 2018, Tirosh, Conlan et al. 2018)
6) HIV and antibody study (n = 1) (Kong, Duan et al. 2019)

In the foreseeable future, NISC is well positioned to provide next-gen sequence data for a multitude of investigators across NIH. We also expect the NIH Clinical Center to make increasing use of sequencing through our CLIA exome test, and we will continue our sequencing support for Intramural NHGRI investigators for their most promising projects. Our focus is to increase operational efficiencies of the next-gen pipeline, refine existing laboratory protocols and analysis pipelines, implement additional protocols as new sample/experimental types are requested from researchers, and continue to expand the value-added data analysis packages available.

We are also testing and implementing new technologies, with the Oxford Nanopore GridION being a major focus over the last year. With the GridION, we have generated extremely long reads, approaching 1,000,000 bases, with N50 read lengths over 70,000 bases, enabling a whole human genome assembly with contiguous sequence spanning chromosomes from telomere to telomere. In summary, we will continue to monitor developments in the rapidly evolving sequencing and informatics technologies, implementing those we deem most appropriate for our collaborating investigators.
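The N50 read length quoted above for the GridION runs is a summary statistic that is easy to misread as an average. As a minimal illustration, and not a description of NISC's actual tooling, the sketch below computes N50 under the usual definition: the length L such that reads of length L or longer together contain at least half of all sequenced bases. The function name `read_n50` and the toy read lengths are hypothetical and are not taken from the project's data.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating bases until
    # at least half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up read lengths (in bases).
toy_lengths = [10, 20, 30, 40]
print(read_n50(toy_lengths))
```

Because the statistic is weighted by bases rather than by read count, a small number of very long reads can raise N50 well above the mean read length.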

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Production Facilities Intramural Research (ZIB)
Project #: 1ZIBHG000196-19
Application #: 10020067
Support Year: 19
Fiscal Year: 2019

Institution

Name: National Human Genome Research Institute

Publications

Le Gallo, Matthieu; Rudd, Meghan L; Urick, Mary Ellen et al. (2018) The FOXA2 transcription factor is frequently somatically mutated in uterine carcinosarcomas and carcinomas. Cancer 124:65-73

Duncan, Christopher G; Grimm, Sara A; Morgan, Daniel L et al. (2018) Dosage compensation and DNA methylation landscape of the X chromosome in mouse liver. Sci Rep 8:10138

Zhou, Tongqing; Zheng, Anqi; Baxa, Ulrich et al. (2018) A Neutralizing Antibody Recognizing Primarily N-Linked Glycan Targets the Silent Face of the HIV Envelope. Immunity 48:500-513.e6

Weingarten, Rebecca A; Johnson, Ryan C; Conlan, Sean et al. (2018) Genomic Analysis of Hospital Plumbing Reveals Diverse Reservoir of Bacterial Plasmids Conferring Carbapenem Resistance. MBio 9:

Strongin, Anna; Heller, Theo; Doherty, Dan et al. (2018) Characteristics of Liver Disease in 100 Individuals With Joubert Syndrome Prospectively Evaluated at a Single Center. J Pediatr Gastroenterol Nutr 66:428-435

Roessler, Erich; Hu, Ping; Marino, Juliana et al. (2018) Common genetic causes of holoprosencephaly are limited to a small set of evolutionarily conserved driver genes of midline development coordinated by TGF-β, hedgehog, and FGF signaling. Hum Mutat 39:1416-1427

Gourh, Pravitt; Remmers, Elaine F; Boyden, Steven E et al. (2018) Brief Report: Whole-Exome Sequencing to Identify Rare Variants and Gene Networks That Increase Susceptibility to Scleroderma in African Americans. Arthritis Rheumatol 70:1654-1660

Randall, Thomas A; Mullikin, James C; Mueller, Geoffrey A (2018) The Draft Genome Assembly of Dermatophagoides pteronyssinus Supports Identification of Novel Allergen Isoforms in Dermatophagoides Species. Int Arch Allergy Immunol 175:136-146

Kimble, Danielle C; Lach, Francis P; Gregg, Siobhan Q et al. (2018) A comprehensive approach to identification of pathogenic FANCA variants in Fanconi anemia patients and their families. Hum Mutat 39:237-254

Harris, Melissa L; Fufa, Temesgen D; Palmer, Joseph W et al. (2018) A direct link between MITF, innate immunity, and hair graying. PLoS Biol 16:e2003648

Showing the most recent 10 out of 209 publications
