Comparative Genomics Unit Research

Mullikin, James

Abstract

Medical Sequencing

My group analyzes data from the large-scale medical sequencing (LSMS) program, which is now running at full scale at NISC (currently 20-30 exomes per week). To generate targeted exome sequence from human samples, we use the following approach. Whole-exome libraries compatible with Illumina paired-end sequencing are prepared for each sample using the standard Illumina protocol. Exome capture is performed using the SureSelect Human All Exon Kit (Agilent Technologies, cat. no. G3362F-001). This kit targets 38 Mb of the human genome, corresponding to the NCBI Consensus CDS database (CCDS) plus over 700 small RNAs and more than 300 non-coding RNAs. Sequencing is performed on the Illumina GAIIx, producing paired-end 100-base reads. We developed a whole-exome variant analysis pipeline that aligns read pairs to the human reference sequence and calls both single-nucleotide variants and small deletion/insertion variants using a Bayesian genotyping algorithm called MPG (Most Probable Genotype). To obtain high-confidence genotypes across at least 85% of the bases targeted by our capture protocol, we generally sequence at least 20,000,000 read pairs, obtaining coverage greater than 60x in bases with a phred quality score of 20 or above. Novel variants and sample genotypes for known variants, along with dbSNP identifiers and predictions of deleterious impact on protein function, are stored in an Oracle database for subsequent comparison and reporting. In cases where multiple members of a family have been sequenced, Mendelian filters are used to narrow down regions of interest where disease-causing variants might lie. Comparison of genotypes for these samples with those obtained from an Illumina 2.5M SNP chip has shown >99.8% concordance, indicating a false discovery rate of less than 1%. We accommodate a number of whole-exome sequencing (WES) projects through the NextGen Sequencing pipeline.
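
As a rough check on the sequencing depth quoted above, the following back-of-envelope sketch converts read pairs into mean depth over the capture target. The function name and the `usable_fraction` parameter are illustrative assumptions and are not part of the NISC pipeline; the calculation also treats the 60x figure as a mean over the 38 Mb target, which the abstract does not state explicitly.

```python
def mean_target_coverage(read_pairs, read_length, target_bases, usable_fraction=1.0):
    """Mean depth over the target, assuming `usable_fraction` of all
    sequenced bases are on target and pass the quality threshold."""
    total_bases = read_pairs * 2 * read_length  # two reads per pair
    return total_bases * usable_fraction / target_bases


# Figures taken from the abstract: 20,000,000 read pairs, 100-base reads, 38 Mb target.
raw_depth = mean_target_coverage(20_000_000, 100, 38_000_000)

# Fraction of sequenced bases that would have to be usable to reach 60x (an inference).
fraction_needed = 60 / raw_depth

print(f"raw depth if every base were usable: {raw_depth:.1f}x")
print(f"usable fraction needed for 60x: {fraction_needed:.2f}")
```

Under these assumptions, 20,000,000 read pairs yield about 105x if every base were usable, so 60x of Q20 coverage corresponds to roughly 57% of sequenced bases being on target and of sufficient quality.
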
The largest project over this reporting year is ClinSeq. We now have WES results on over 250 human genomic DNA samples, 150 of which are from ClinSeq subjects. This effort is in collaboration with Dr. Les Biesecker. The second largest project is the Undiagnosed Diseases Program, with WES data on 50 samples. The other WES datasets are spread across numerous smaller projects. A review article describing current methods for whole-exome sequencing is available; see Teer and Mullikin, 2010.

Other collaborations

In collaboration with Dr. Margulies, we developed a new approach for genome assembly from short reads using reduced representation libraries. This effort brought together a number of technologies; see Young et al., 2010. In collaboration with Dr. Schuster, I assembled de novo the genome of a Kalahari Bushman individual from GS454 sequence data; see Schuster et al., 2010. As part of the ClinSeq project, we identified a novel LDLR mutation and highlighted the importance of specifying both the DNA and the protein change, because the protein change alone is ambiguous; see Ng et al., 2010. In collaboration with Drs. Brockman, Smith and O'Brien, a greatly improved assembly and SNP map was developed for the cat genome; see Mullikin et al., 2010. In collaboration with Dr. Drayna, mutations involved in persistent stuttering were identified; see Kang et al., 2010. In collaboration with Dr. Biesecker, targeted exome sequencing of the X chromosome using next-generation sequencing identified RBM10 as the gene that causes a syndromic form of cleft palate; see Johnston et al., 2010. In collaboration with Dr. Pääbo, we compared the Neandertal genome with the human genome and found human-lineage-specific changes that arose since our divergence from our closest extinct hominid relative. We also found evidence of introgression between modern humans and Neandertals when they coexisted in the Middle East 80-50 kya. This introgression signal is apparent in three out-of-Africa individuals from different ancestral populations and is not observed in two sub-Saharan African individuals from two different populations; see Green et al., 2010.
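
The Mendelian filtering mentioned in the medical sequencing description can be illustrated with a minimal sketch for a parent-child trio. This is not the pipeline's actual implementation: the function names, the genotype representation (a pair of allele strings), and the example sites are all hypothetical, and only a simple autosomal recessive model is shown.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """True if the child's genotype can be formed from one allele of each parent."""
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


def fits_recessive_model(child, mother, father, alt):
    """Affected child homozygous for `alt`, both parents heterozygous carriers."""
    return child.count(alt) == 2 and mother.count(alt) == 1 and father.count(alt) == 1


# Hypothetical genotypes at three variant sites (not real data).
trio_sites = {
    "site_a": {"child": ("T", "T"), "mother": ("C", "T"), "father": ("C", "T"), "alt": "T"},
    "site_b": {"child": ("C", "T"), "mother": ("C", "C"), "father": ("C", "T"), "alt": "T"},
    "site_c": {"child": ("T", "T"), "mother": ("C", "C"), "father": ("C", "T"), "alt": "T"},
}

# Keep only sites that are Mendelian-consistent and match the recessive model.
candidates = [
    name
    for name, g in trio_sites.items()
    if is_mendelian_consistent(g["child"], g["mother"], g["father"])
    and fits_recessive_model(g["child"], g["mother"], g["father"], g["alt"])
]
print(candidates)
```

Here `site_b` is consistent with inheritance but does not fit the recessive model, and `site_c` is a Mendelian inconsistency, so only `site_a` would remain as a candidate. A production filter would also need to handle missing genotypes, sex chromosomes, and other inheritance models.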

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Investigator-Initiated Intramural Research Projects (ZIA)
Project #: 1ZIAHG200330-06
Application #: 8149441
Support Year: 6
Fiscal Year: 2010
Total Cost: $1,182,109

Institution

Name: National Human Genome Research Institute

Publications

Le Gallo, Matthieu; Rudd, Meghan L; Urick, Mary Ellen et al. (2018) The FOXA2 transcription factor is frequently somatically mutated in uterine carcinosarcomas and carcinomas. Cancer 124:65-73

Chen, Y-C; Sudre, G; Sharp, W et al. (2018) Neuroanatomic, epigenetic and genetic differences in monozygotic twins discordant for attention deficit hyperactivity disorder. Mol Psychiatry 23:683-690

Randall, Thomas A; Mullikin, James C; Mueller, Geoffrey A (2018) The Draft Genome Assembly of Dermatophagoides pteronyssinus Supports Identification of Novel Allergen Isoforms in Dermatophagoides Species. Int Arch Allergy Immunol 175:136-146

Gandolfi, Barbara; Alhaddad, Hasan; Abdi, Mona et al. (2018) Applications and efficiencies of the first cat 63K DNA array. Sci Rep 8:7024

Serrano Negron, Yazmin L; Hansen, Nancy F; Harbison, Susan T (2018) The Sleep Inbred Panel, a Collection of Inbred Drosophila melanogaster with Extreme Long and Short Sleep Duration. G3 (Bethesda) 8:2865-2873

Harbison, Susan T; Serrano Negron, Yazmin L; Hansen, Nancy F et al. (2017) Selection for long and short sleep duration in Drosophila melanogaster reveals the complex genetic network underlying natural variation in sleep. PLoS Genet 13:e1007098

Le Gallo, Matthieu; Rudd, Meghan L; Urick, Mary Ellen et al. (2017) Somatic mutation profiles of clear cell endometrial tumors revealed by whole exome and targeted gene sequencing. Cancer 123:3261-3268

Kwon, Erika M; Connelly, John P; Hansen, Nancy F et al. (2017) iPSCs and fibroblast subclones from the same fibroblast population contain comparable levels of sequence variations. Proc Natl Acad Sci U S A 114:1964-1969

Dewan, Ramita; Pemov, Alexander; Dutra, Amalia S et al. (2017) First insight into the somatic mutation burden of neurofibromatosis type 2-associated grade I and grade II meningiomas: a case report comprehensive genomic study of two cranial meningiomas with vastly different clinical presentation. BMC Cancer 17:127

Ng, David; Hong, Celine S; Singh, Larry N et al. (2017) Assessing the capability of massively parallel sequencing for opportunistic pharmacogenetic screening. Genet Med 19:357-361

Showing the most recent 10 out of 141 publications
