Bioinformatics is the application of statistics and computer science to molecular biology. It has emerged as a field unto itself, as the datasets generated by modern biomedical researchers easily exceed what can be directly analyzed. Core C will work with the data generated by massively parallel sequencing from human, mouse and zebrafish to extract variants that have the potential to cause disease. The PIs of Cores A, B and C have worked together extensively in the past, and have an established track record of productivity in the area of next generation sequencing (NGS) data analysis. Dr. Bafna has worked broadly in bioinformatics and genomics, developing computational methodologies that employ novel algorithms and statistical techniques for NGS datasets. We envision that the whole exome sequencing (WES) data generated by Core B will be delivered to Core C for extraction of the potentially deleterious sequence variants (PDSVs), which will be delivered back to each of the Projects for segregation analysis and further validation. This will be accomplished by developing the four key pipelines of Core C: 1) WES data tracking and storage pipeline, 2) WES data analysis pipeline, 3) mutation identification pipeline, 4) comparative genomics pipeline. The analysis of WES datasets is presented in this application as a series of filters that are applied to the primary sequence to extract all relevant variants, followed by a heuristic ranking strategy to detect the PDSVs most likely associated with the phenotype. The output of these FILTER and PRIORITIZE programs is then reported as both SNPs and INDELs in a ranked fashion, for later validation and segregation testing. Further analysis will help uncover the contribution of these genes to common disease as well as genome-wide gene-gene interactions using other software we have developed.
We are also well-positioned to take full advantage of the 3rd generation DNA sequencers, and are excited that UCSD will serve as one of the national HHMI PacBio Sequencing Centers. These tools, together with the outstanding and unique human and animal resources, will make for a powerful combination to investigate new causes of structural brain disorders.
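
The filter-then-rank strategy described above can be illustrated with a minimal sketch. This is not the Core's actual FILTER or PRIORITIZE software: the `Variant` fields, the depth and population-frequency thresholds, and the `EFFECT_SCORE` weights are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str        # placeholder gene label
    kind: str        # "SNP" or "INDEL"
    depth: int       # read depth at the variant site
    pop_freq: float  # allele frequency in a reference population
    effect: str      # predicted effect, e.g. "nonsense", "missense"


# Hypothetical heuristic weights: a higher score means more likely deleterious.
EFFECT_SCORE = {"nonsense": 3, "frameshift": 3, "splice": 2, "missense": 1}


def filter_variants(variants, min_depth=10, max_pop_freq=0.01):
    """Apply successive filters: adequate depth, rare, protein-altering."""
    return [
        v for v in variants
        if v.depth >= min_depth
        and v.pop_freq <= max_pop_freq
        and v.effect in EFFECT_SCORE
    ]


def prioritize(variants):
    """Rank by effect score (descending), then by rarity (ascending frequency)."""
    return sorted(variants, key=lambda v: (-EFFECT_SCORE[v.effect], v.pop_freq))


calls = [
    Variant("GENE_A", "SNP", 40, 0.0, "missense"),
    Variant("GENE_B", "INDEL", 25, 0.001, "frameshift"),
    Variant("GENE_C", "SNP", 5, 0.0, "nonsense"),     # removed: low depth
    Variant("GENE_D", "SNP", 60, 0.20, "missense"),   # removed: common variant
    Variant("GENE_E", "SNP", 30, 0.0, "synonymous"),  # removed: not protein-altering
    Variant("GENE_F", "SNP", 35, 0.0, "nonsense"),
]

ranked = prioritize(filter_variants(calls))
for rank, v in enumerate(ranked, start=1):
    print(rank, v.gene, v.kind, v.effect)
```

In this sketch SNPs and INDELs are ranked together in one list; a real pipeline would likely use richer annotations and inheritance-model filters, which are not specified here.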

Public Health Relevance

The Bioinformatics Core (Core C) will work with all Projects and Cores to integrate large datasets for analysis and ranking of likely causative variants. Core C will also maintain key metrics of next generation sequencing and report back deficiencies in coverage or systematic trends in data recovery. Core C will integrate new sequencing approaches in Core B and comparative genomic approaches to unify all projects.
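
As a rough illustration of reporting coverage deficiencies, the sketch below flags capture targets whose mean read depth falls below a threshold. The input layout (a mapping from target name to per-base depths), the function name `coverage_report`, and the threshold of 20 are assumptions made for this example, not the Core's actual metrics.

```python
def coverage_report(depth_by_target, min_mean_depth=20.0):
    """Summarize mean depth per target and flag targets below the threshold."""
    report = {}
    for target, depths in depth_by_target.items():
        # A target with no depth values is treated as uncovered.
        mean_depth = sum(depths) / len(depths) if depths else 0.0
        report[target] = {
            "mean_depth": mean_depth,
            "deficient": mean_depth < min_mean_depth,
        }
    return report


# Hypothetical per-base depths for three exon targets.
example = {
    "exon_1": [30, 35, 28, 31],
    "exon_2": [4, 6, 5, 9],
    "exon_3": [],
}

report = coverage_report(example)
deficient = sorted(t for t, r in report.items() if r["deficient"])
print("Targets with deficient coverage:", deficient)
```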

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZHD1-DSR-Y)
University of California San Diego
La Jolla
United States
Marin-Valencia, Isaac; Guerrini, Renzo; Gleeson, Joseph G (2014) Pathogenetic mechanisms of focal cortical dysplasia. Epilepsia 55:970-8
Schaffer, Ashleigh E; Eggens, Veerle R C; Caglayan, Ahmet Okay et al. (2014) CLP1 founder mutation links tRNA splicing and maturation to cerebellar development and neurodegeneration. Cell 157:651-63
Marín, Oscar; Müller, Ulrich (2014) Lineage origins of GABAergic versus glutamatergic neurons in the neocortex. Curr Opin Neurobiol 26:132-41
Novarino, Gaia; Fenstermaker, Ali G; Zaki, Maha S et al. (2014) Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders. Science 343:506-11
Thomas, Sophie; Wright, Kevin J; Le Corre, Stéphanie et al. (2014) A homozygous PDE6D mutation in Joubert syndrome impairs targeting of farnesylated INPP5E protein to the primary cilium. Hum Mutat 35:137-46
Kinsella, Marcus; Patel, Anand; Bafna, Vineet (2014) The elusive evidence for chromothripsis. Nucleic Acids Res 42:8231-42
Ronen, Roy; Zhou, Dan; Bafna, Vineet et al. (2014) The genetic basis of chronic mountain sickness. Physiology (Bethesda) 29:403-12
Kramer, Michael; Dutkowski, Janusz; Yu, Michael et al. (2014) Inferring gene ontologies from pairwise similarity data. Bioinformatics 30:i34-42
Akizu, Naiara; Silhavy, Jennifer L; Rosti, Rasim Ozgur et al. (2014) Mutations in CSPP1 lead to classical Joubert syndrome. Am J Hum Genet 94:80-6
Gil-Sanz, Cristina; Landeira, Bruna; Ramos, Cynthia et al. (2014) Proliferative defects and formation of a double cortex in mice lacking Mltt4 and Cdh2 in the dorsal telencephalon. J Neurosci 34:10475-87
