Bioinformatics is the application of statistics and computer science to molecular biology. It has emerged as a field unto itself, as the datasets generated by modern biomedical researchers easily exceed what can be directly analyzed. Core C will work with the data generated by massively parallel sequencing from human, mouse and zebrafish to extract variants with the potential to cause disease. The PIs of Cores A, B and C have worked together extensively in the past, and have an established track record of productivity in the area of next generation sequencing (NGS) data analysis. Dr. Bafna has worked broadly in bioinformatics and genomics, developing computational methodologies that employ novel algorithms and statistical techniques for NGS datasets. We envision that the whole-exome sequencing (WES) data generated by Core B will be delivered to Core C for extraction of the potentially deleterious sequence variants (PDSVs), which will be delivered back to each of the Projects for segregation analysis and further validation. This will be accomplished by developing the four key pipelines of Core C: (1) WES data tracking and storage pipeline, (2) WES data analysis pipeline, (3) mutation identification pipeline, and (4) comparative genomics pipeline. The analysis of WES datasets is presented in this application as a series of filters that are applied to the primary sequence to extract all relevant variants, followed by a heuristic ranking strategy to detect the PDSVs most likely associated with the phenotype. The outputs of these FILTER and PRIORITIZE programs are then reported as both SNPs and INDELs in a ranked fashion, for later validation and segregation testing. Further analysis, using other software we have developed, will help uncover the contribution of these genes to common disease as well as genome-wide gene-gene interactions.
We are also well positioned to take full advantage of third-generation DNA sequencers, and are excited that UCSD will serve as one of the national HHMI PacBio Sequencing Centers. These tools, together with the outstanding and unique human and animal resources, will make for a powerful combination to investigate new causes of structural brain disorders.
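
The filter-then-prioritize strategy described in the abstract can be illustrated with a minimal sketch. This is not the Core's FILTER or PRIORITIZE software: the `Variant` fields, the severity weights, the 1% frequency cutoff and the scoring formula are all hypothetical, chosen only to show how a series of filters followed by a heuristic ranking might fit together.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Variant:
    """One called variant. All fields are illustrative assumptions."""
    gene: str
    kind: str            # "SNP" or "INDEL"
    consequence: str     # e.g. "missense", "frameshift", "synonymous"
    allele_freq: float   # population allele frequency, 0..1
    conservation: float  # cross-species conservation score, 0..1


# Hypothetical severity weights; real pipelines use curated annotation.
SEVERITY = {
    "frameshift": 1.0,
    "stop_gained": 1.0,
    "splice_site": 0.9,
    "missense": 0.6,
}


def is_rare(v: Variant, max_freq: float = 0.01) -> bool:
    """Filter: keep variants at or below an assumed frequency cutoff."""
    return v.allele_freq <= max_freq


def is_protein_altering(v: Variant) -> bool:
    """Filter: keep variants whose consequence has a severity weight."""
    return v.consequence in SEVERITY


def apply_filters(
    variants: Iterable[Variant],
    filters: List[Callable[[Variant], bool]],
) -> List[Variant]:
    """Keep only the variants that pass every filter in the series."""
    return [v for v in variants if all(f(v) for f in filters)]


def score(v: Variant) -> float:
    """Heuristic score: severe, conserved and rare variants rank higher."""
    severity = SEVERITY.get(v.consequence, 0.0)
    return severity * (0.5 + 0.5 * v.conservation) * (1.0 - v.allele_freq)


def prioritize(variants: Iterable[Variant]) -> List[Variant]:
    """Return variants sorted from highest to lowest heuristic score."""
    return sorted(variants, key=score, reverse=True)
```

Calling `prioritize(apply_filters(calls, [is_rare, is_protein_altering]))` on a list of `Variant` records returns a ranked list in which SNPs and INDELs are scored on the same scale, so both can be reported together for validation and segregation testing.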

Public Health Relevance

The Bioinformatics Core (Core C) will work with all Projects and Cores to integrate large datasets for analysis and ranking of likely causative variants. Core C will also maintain key metrics of next generation sequencing and report back deficiencies in coverage or systematic trends in data recovery. Core C will integrate new sequencing approaches in Core B and comparative genomic approaches to unify all projects.
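
As a rough illustration of reporting coverage deficiencies, the sketch below flags capture targets where too few bases reach a minimum read depth. The function name, the 20x depth threshold and the 90% fraction cutoff are assumptions made for this example, not metrics specified by the Core.

```python
from typing import Dict, List, Tuple


def coverage_deficiencies(
    depth_by_target: Dict[str, List[int]],
    min_depth: int = 20,         # assumed per-base depth threshold
    min_fraction: float = 0.9,   # assumed fraction of bases that must pass
) -> List[Tuple[str, float]]:
    """Return (target, fraction_covered) for under-covered targets.

    A target is deficient when fewer than `min_fraction` of its bases
    have depth >= `min_depth`. Results are sorted worst-first.
    """
    deficient = []
    for target, depths in depth_by_target.items():
        if not depths:
            # No data at all counts as zero coverage.
            deficient.append((target, 0.0))
            continue
        fraction = sum(d >= min_depth for d in depths) / len(depths)
        if fraction < min_fraction:
            deficient.append((target, fraction))
    return sorted(deficient, key=lambda item: item[1])
```

Tracking the same report across sequencing runs is one simple way to spot systematic trends, such as a target that is consistently under-covered.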

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZHD1-DSR-Y)
University of California San Diego
La Jolla
United States
Marin-Valencia, Isaac; Novarino, Gaia; Johansen, Anide et al. (2018) A homozygous founder mutation in TRAPPC6B associates with a neurodevelopmental disorder characterised by microcephaly, epilepsy and autistic features. J Med Genet 55:48-54
Schaffer, Ashleigh E; Breuss, Martin W; Caglayan, Ahmet Okay et al. (2018) Biallelic loss of human CTNNA2, encoding αN-catenin, leads to ARP2/3 complex overactivity and disordered cortical neuronal migration. Nat Genet 50:1093-1101
Makrythanasis, Periklis; Maroofian, Reza; Stray-Pedersen, Asbjørg et al. (2018) Biallelic variants in KIF14 cause intellectual disability with microcephaly. Eur J Hum Genet 26:330-339
Breuss, Martin W; Nguyen, Thai; Srivatsan, Anjana et al. (2017) Uner Tan syndrome caused by a homozygous TUBB2B mutation affecting microtubule stability. Hum Mol Genet 26:258-269
De Mori, Roberta; Romani, Marta; D'Arrigo, Stefano et al. (2017) Hypomorphic Recessive Variants in SUFU Impair the Sonic Hedgehog Pathway and Cause Joubert Syndrome with Cranio-facial and Skeletal Defects. Am J Hum Genet 101:552-563
Marin-Valencia, Isaac; Gerondopoulos, Andreas; Zaki, Maha S et al. (2017) Homozygous Mutations in TBC1D23 Lead to a Non-degenerative Form of Pontocerebellar Hypoplasia. Am J Hum Genet 101:441-450
Friedman, Jennifer; Feigenbaum, Annette; Chuang, Nathaniel et al. (2017) Pyruvate dehydrogenase complex-E2 deficiency causes paroxysmal exercise-induced dyskinesia. Neurology 89:2297-2298
Koizumi, Hiroyuki; Fujioka, Hiromi; Togashi, Kazuya et al. (2017) DCLK1 phosphorylates the microtubule-associated protein MAP7D1 to promote axon elongation in cortical neurons. Dev Neurobiol 77:493-510
McConnell, Michael J; Moran, John V; Abyzov, Alexej et al. (2017) Intersection of diverse neuronal genomes and neuropsychiatric disease: The Brain Somatic Mosaicism Network. Science 356:
Lardelli, Rea M; Schaffer, Ashleigh E; Eggens, Veerle R C et al. (2017) Biallelic mutations in the 3' exonuclease TOE1 cause pontocerebellar hypoplasia and uncover a role in snRNA processing. Nat Genet 49:457-464
