Cobre: Uid: Bioinformatics Core Facility

Foster, James

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.
The IBEST Bioinformatics Core currently comprises several compute clusters, stand-alone application servers, data storage systems, software, and personnel. Our primary production cluster is made up of Dell M1000E enclosures and M605 blades. It has a total of 512 cores (AMD64) and 512GB of total system memory (1GB per core). In addition to our primary cluster, we maintain a 96-processor Intel Xeon-based system with 48GB of total system memory (512MB per processor) and a 96-processor PowerPC G5-based system with 192GB of total system memory (2GB per processor). We also maintain a cluster used primarily for testing and development, which is made up of 44 Intel Xeon processors and 22GB of system memory (512MB per processor). In addition to the research clusters, we have a small cluster (datarig) made up of Dell PE2950 and PE1950 servers, which is dedicated to the post-processing of 454 sequencing data. It is a 40-core Intel Xeon cluster with 96GB of total system memory. The clusters are currently networked with 1Gb/s TCP interconnects.
The stand-alone application servers include three Dell M905s, each with 16 cores and 32GB of system memory; two Dell PE6950s, each with 8 cores and 8GB of system memory; and two dual-processor Sun SPARC V440s. We support over 85TB of total data storage and backup. Our LTO-4 tape backup system is capable of backing up 20TB of data.
Our main production cluster has 30TB of dedicated storage, the 454 datarig has 15TB of dedicated storage, and the other production and development clusters share the remaining 20TB of data storage. All user data is backed up regularly.
The core systems are located on the University of Idaho campus in a 1,400-square-foot room that has been specifically designed and renovated by UI for this Core. 1GB fiber and copper connect all equipment, and the UI backbone provides 4GB/s transfer rates. This room has a dedicated 80 kVA UPS with three-phase power and four forced-air handlers attached to redundant university chilled water systems. The facility has an emergency backup diesel generator.
The bioinformatics core is connected to the university backbone with 1Gb/s fiber and provides 1Gb/s networking to the faculty offices and laboratories. In addition, the University of Idaho, funded in part through the $10M NIH Lariat infrastructure grant, has expanded off-campus data transmission capacity to 2.8Gb/s in the short term and will expand it to 10Gb/s within 3 years. This will enable large, high-speed data transfers with the rest of the world, rather than just within the university. This is important both for collaborations and for systems support, since keeping our many large databases up to date requires constant transmission of vast amounts of data from primary database providers such as NCBI.
Software
A wide array of software is available for general sequence analysis, phylogenetic and population genetics analyses, protein structure modeling, expression array analysis, statistics, and mathematical modeling.
The software available on these computers includes: General Sequence Analysis Packages (EMBOSS, etc.), Database Access (PDB, SCOP, GenBank, etc.), Phylogenetic Inference (PHYLIP, PAUP*, MrBayes, fastDNAml, GeneTree, MODELTEST, P4, PAML, Seq-Gen, TreeView), Population Genetics (Migrate, Fluctuate, Recombine, Lamarc, GeneConv), Sequence Alignment (HMMER, ClustalW, mafft, muscle, etc.), Sequence Assembly (Phred/Phrap/Consed, RepeatMasker), Protein Structure Visualization (Amber, Charmm, Cn3D, Rasmol, 3D Molecular Viewer), and Statistical/Mathematical Packages (Mathematica, MatLab, R, S3 Stochastic Spatial). Most of these programs are free for academic use, while others are commercial packages or have been developed by COBRE students and personnel. The latter include new software (EVALYN) for multiple sequence alignments; a fast program (ClearCut) for inferring phylogenetic trees, based on a modified neighbor-joining method; a program for high-throughput analysis of ribosomal RNA gene sequences (HiTSA); a companion program (StatGen) that summarizes and graphically displays the results from HiTSA; and Microbial Community Analysis (MiCA), for analyzing TRFLP data about bacterial communities. In addition, tools to facilitate data analysis have been developed, including an "all-against-all" BLAST, a tool for detecting transposable elements in genomes that uses RepeatMasker, and tools for distributed PAUP and bootstrap analysis. Each of these software and data analysis tools is freely available to researchers anywhere.
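
The per-processor memory figures and the storage total quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from the abstract, while the dictionary labels, the function name, and the assumption that the "over 85TB" figure is the sum of the three dedicated storage pools plus the 20TB tape backup capacity are ours, not something the abstract states.

```python
# Processor counts and total memory (GB) as quoted in the abstract.
# The keys are informal labels chosen for this sketch, not official system names.
clusters = {
    "primary (Dell M605, AMD64)": (512, 512),
    "Intel Xeon": (96, 48),
    "PowerPC G5": (96, 192),
    "test/development (Intel Xeon)": (44, 22),
}


def memory_per_processor_mb(processors, total_gb):
    """Return memory per processor in MB, taking 1 GB = 1024 MB."""
    return total_gb * 1024 / processors


per_processor = {
    name: memory_per_processor_mb(processors, total_gb)
    for name, (processors, total_gb) in clusters.items()
}

# Dedicated storage pools (TB) as quoted in the abstract.
storage_tb = {"primary cluster": 30, "454 datarig": 15, "other clusters": 20}
tape_backup_tb = 20

# Assumption: "storage and backup" means disk pools plus tape backup capacity.
total_tb = sum(storage_tb.values()) + tape_backup_tb

for name, mb in per_processor.items():
    print(f"{name}: {mb:.0f} MB per processor")
print(f"storage plus backup: {total_tb} TB")
```

Under these assumptions the results agree with the quoted values: 1GB, 512MB, 2GB, and 512MB per processor, and 85TB of combined storage and backup.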

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Exploratory Grants (P20)
Project #: 5P20RR016448-09
Application #: 8359574
Study Section: National Center for Research Resources Initial Review Group (RIRG)

Project Start: 2011-02-01
Project End: 2012-01-31
Budget Start: 2011-02-01
Budget End: 2012-01-31
Support Year: 9
Fiscal Year: 2011
Total Cost: $123,953
Indirect Cost

Institution

Name: University of Idaho
Department: Biology
Type: Schools of Arts and Sciences
DUNS #: 075746271

City: Moscow
State: ID
Country: United States
Zip Code: 83844

Publications

Ruffley, Megan; Smith, Megan L; Espíndola, Anahí et al. (2018) Combining allele frequency and tree-based approaches improves phylogeographic inference from natural history collections. Mol Ecol 27:1012-1024

Chernikova, Diana A; Madan, Juliette C; Housman, Molly L et al. (2018) The premature infant gut microbiome during the first 6 weeks of life differs based on gestational maturity at birth. Pediatr Res 84:71-79

Smith, Stephanie A; Benardini 3rd, James N; Anderl, David et al. (2017) Identification and Characterization of Early Mission Phase Microorganisms Residing on the Mars Science Laboratory and Assessment of Their Potential to Survive Mars-like Conditions. Astrobiology 17:253-265

Marx, Hannah E; Dentant, Cédric; Renaud, Julien et al. (2017) Riders in the sky (islands): using a mega-phylogenetic approach to understand plant species distribution and coexistence at the altitudinal limits of angiosperm plant life. J Biogeogr 44:2618-2630

Yano, Hirokazu; Wegrzyn, Katarznya; Loftie-Eaton, Wesley et al. (2016) Evolved plasmid-host interactions reduce plasmid interference cost. Mol Microbiol 101:743-56

Sarver, Brice A J; Demboski, John R; Good, Jeffrey M et al. (2016) Comparative Phylogenomic Assessment of Mitochondrial Introgression among Several Species of Chipmunks (TAMIAS). Genome Biol Evol :

Stockmann, Chris; Ampofo, Krow; Pavia, Andrew T et al. (2016) Clinical and Epidemiological Evidence of the Red Queen Hypothesis in Pneumococcal Serotype Dynamics. Clin Infect Dis 63:619-626

Loftie-Eaton, Wesley; Yano, Hirokazu; Burleigh, Stephen et al. (2016) Evolutionary Paths That Expand Plasmid Host-Range: Implications for Spread of Antibiotic Resistance. Mol Biol Evol 33:885-97

Uribe-Convers, Simon; Settles, Matthew L; Tank, David C (2016) A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae). PLoS One 11:e0148203

Chernikova, Diana A; Koestler, Devin C; Hoen, Anne Gatewood et al. (2016) Fetal exposures and perinatal influences on the stool microbiota of premature infants. J Matern Fetal Neonatal Med 29:99-105

Showing the most recent 10 out of 196 publications
