Over the last year, NISC has put two additional NextGen DNA sequencing machines into production, an Illumina HiSeq2000 and a Roche GS454, bringing our current suite of production NextGen sequencing machines to 4 HiSeq2000s, 2 GAIIx instruments, 3 MiSeqs and 2 GS454s. Using these platforms, we have generated about 400 billion reads in the past year alone. Though we remain at the level of a mid-scale genome sequencing center, we have maintained advantageous economies of scale while remaining relatively agile. NISC has undertaken projects of many sizes and types over the years, from ESTs to SAGE sequencing, but the NISC Comparative Sequencing Program has been its most productive overarching success. The program began with the sequencing of mouse BACs orthologous to human chromosome 7 at the start of the mouse genome project and has since extended to over 75 species across numerous targets, including the flagship CFTR target that encompasses 1 Mb of human chromosome 7. This BAC-based sequencing approach proved highly useful for scouting new genomes and for specialized targeting of complex genomic regions whose duplications and structural rearrangements made them intractable by traditional genomic sequencing approaches. There were 3 BAC-related publications this reporting period (5, 10, 16). In keeping with its Comparative Sequencing interests, several years ago NISC implemented an amplicon-based Sanger sequencing pipeline designed to focus on intra-species variation. Numerous clinically relevant projects were designed to amplify and sequence specific genes and regions of interest in small groups of human subjects, yielding important insights into disease-related genotype/phenotype combinations. The flagship ClinSeq Project greatly advanced the study of atherosclerosis by providing sequence data for 250 genes in over 500 volunteers.
While this approach was extremely productive, the combination of large volumes of high-quality sequence data generated by the Illumina platform and efficient whole exome genomic enrichment techniques evaluated and adopted by NISC has allowed us to transition to an even more cost-effective approach that provides an increasingly comprehensive data set. As a consequence of these advances, NISC no longer offers Sanger-based amplicon targeted sequencing in production mode. Four publications related to those previous efforts are listed in the publications section of this report (2, 12, 17, 23). The adoption of many new sequencing protocols in production created a commensurate need for dramatic changes to sample tracking, flow control and primary analysis pipelines. A dedicated team rapidly designed, developed and implemented a new Laboratory Information Management System (LIMS), which has met the initial challenges and continues to evolve quickly to adapt to a continuous flow of changes. A combination of talented IT staff and bioinformaticians has met the challenges of extremely large and complex data sets by implementing and continuously adapting pipeline programs to support the rapidly evolving software associated with each of the sequencing platforms. Beyond the primary analysis that results in DNA basecalls and quality scores, NISC has worked closely with members of other NHGRI research groups to implement and support high-throughput production of biologically relevant secondary analysis. One notable example of these efforts is the production-scale processing of Whole Exome Sequencing (WES) data for all of our clients. The end product is a distilled set of variants of interest that is accessible in a user-friendly fashion through the in-house developed VarSifter program. The success of these programs has led to an increasing number of projects from a growing number of investigators.
The implementation of improved project management tools is helping to address the challenges associated with such growth. That growth is now yielding results in the form of publications for WES (n = 6) (7, 13, 15, 20, 21, 24), miRNA (n = 1) (11), Whole Genome Sequencing, Assembly and/or Annotation (n = 4) (4, 14, 18, 25), custom targeted sequencing (n = 3) (1, 6, 22), RNAseq (n = 3) (1, 8, 18), microbiome studies (n = 3) (3, 19, 25), and HIV antibody studies (n = 4) (9, 26, 27, 28). In the foreseeable future, NISC is well positioned to provide next-gen sequence data for several large, multi-year projects, for example the Skin Microbiome Project and the Mouse Methylome Project, a collaboration with NIEHS, and to expand access to sequencing for Intramural NHGRI investigators through a new internal review process for advancing the most promising projects. Our focus is to increase the operational efficiency of the next-gen pipeline, refine existing protocols, implement additional protocols as researchers request new sample and experimental types, and continue to expand the value-added data analysis packages available. We are currently testing specific applications for new technologies, including the Ion Torrent sequencing instrument, PacBio-generated data as a foundation for microbial genome sequencing, and the OpGen/Argus physical restriction mapping platform. Furthermore, we will continue to monitor developments in the rapidly evolving sequencing and informatics technologies, implementing those we deem most appropriate for the sequence data we produce for collaborating investigators.

Support Year: 13
Fiscal Year: 2013
Total Cost: $9,119,432
Name: National Human Genome Research Institute
Kapoor, Ashish; Sekar, Rajesh B; Hansen, Nancy F et al. (2014) An enhancer polymorphism at the cardiomyocyte intercalated disc protein NOS1AP locus is a major regulator of the QT interval. Am J Hum Genet 94:854-69
Pierson, Tyler Mark; Yuan, Hongjie; Marsh, Eric D et al. (2014) GRIN2A mutation and early-onset epileptic encephalopathy: personalized therapy with memantine. Ann Clin Transl Neurol 1:190-198
Bentley, Amy R; Chen, Guanjie; Shriner, Daniel et al. (2014) Gene-based sequencing identifies lipid-influencing variants with ethnicity-specific effects in African Americans. PLoS Genet 10:e1004190
LaFave, Matthew C; Varshney, Gaurav K; Vemulapalli, Meghana et al. (2014) A Defined Zebrafish Line for High-Throughput Genetics and Genomics: NHGRI-1. Genetics 198:167-70
Price, Jessica C; Pollock, Lana M; Rudd, Meghan L et al. (2014) Sequencing of candidate chromosome instability genes in endometrial cancers reveals somatic mutations in ESCO1, CHTF18, and MRE11A. PLoS One 8:e63313
Prufer, Kay; Racimo, Fernando; Patterson, Nick et al. (2014) The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505:43-9
Doria-Rose, Nicole A; Schramm, Chaim A; Gorman, Jason et al. (2014) Developmental pathway for potent V1V2-directed HIV-neutralizing antibodies. Nature 509:55-62
Prickett, Todd D; Zerlanko, Brad; Gartner, Jared J et al. (2014) Somatic mutations in MAP3K5 attenuate its proapoptotic function in melanoma through increased binding to thioredoxin. J Invest Dermatol 134:452-60
Sen, Shurjo K; Boelte, Kimberly C; Barb, Jennifer J et al. (2014) Integrative DNA, RNA, and protein evidence connects TREML4 to coronary artery calcification. Am J Hum Genet 95:66-76
Lee, M; Dworkin, A M; Gildea, D et al. (2014) RRP1B is a metastasis modifier that regulates the expression of alternative mRNA isoforms through interactions with SRSF1. Oncogene 33:1818-27

Showing the most recent 10 out of 37 publications