Core C: Bioinformatics

Bioinformatics is the application of statistics and computer science to molecular biology. It has emerged as a field in its own right because the datasets generated by modern biomedical research easily exceed what can be inspected directly. The sheer volume of data increases the chance of false-negative and false-positive results and argues for robust statistical models and reproducible workflows. Core C will work with the massively parallel sequencing data generated from human, frog and mouse in Projects I, II and III and in Core B to extract variants that have the potential to cause meningomyelocele or to influence neural tube phenotypes. The PIs of the Projects and Cores have worked together extensively in the past and have an established track record of productivity in next-generation sequencing (NGS) data analysis. Dr. Bafna has worked broadly in bioinformatics and genomics, developing computational methodologies that employ novel algorithms and statistical techniques for NGS datasets. We envision that the DNA sequencing data from Project I, in the form of whole genome or whole exome sequencing from patients and their parents, will be delivered to Core C for identification and prioritization of potentially pathogenic, risk-associated variants. RNA sequencing, single cell sequencing and epigenetic sequencing data generated by Core B, as well as data imported from Projects I, II and III, will be delivered to Core C for extraction of expression changes; these results will in turn be delivered to each of the Projects for segregation analysis and further validation.
The Bioinformatics Core will provide analysis pipelines to identify and annotate variants; develop innovative network, RNAseq, Methylseq and single cell analyses to discover novel genetic mechanisms of meningomyelocele (MM) based on protein-protein interaction (PPI) and gene co-expression networks; interpret large datasets from current genetic and genomic technologies; and apply these in the different components of this Program Project. Although our primary goal is to provide service using existing computational methods, we expect that the Core will also develop novel computational methods as required by the Projects and Cores, as we did in developing our current WGS analysis pipeline. Methods development will be geared towards fundamental unsolved problems underlying the above four key functions, such as algorithms for correlating variants with phenotypes, further improvements in methods for computing epistatic interactions, detection of short tandem repeats and mobile elements from WGS data, advanced methods for integrating genotypes with pathways, use of next-generation sequencing (NGS) in the analysis of gene association, and discovery of genetic variants that influence protein expression or function.