Bioinformatics Core

Rouchka, Eric

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.

Students Supported

I. Elizabeth Cha (doctoral candidate), January 2005-present. Elizabeth has been involved in research concerning the clustering, alignment, and mapping of large databases of biological data such as human ESTs. As part of her research, she has looked at methods for parallelizing approaches to sequence alignment. One of her projects examined the performance of different versions of the BLAST software, including possible enhancements. This research was accepted as a paper and presented at the HICOMB 2005 conference. In November 2005, Ms. Cha successfully presented her dissertation proposal and passed her preliminary examination on methods for detecting and classifying pseudogenes. In addition, Ms. Cha is currently responsible for the maintenance and update of the computers in the Bioinformatics Laboratory.

Ravishrikanth Gundlapalli (master's student), Summer 2004; Fall 2005. Ravi spent the summer of 2004 and the fall of 2005 working on a front-end system for the design and storage of data pertinent to customized gene arrays. His system will allow researchers to design and analyze their own custom chips much more quickly than the currently implemented approach. Mr. Gundlapalli successfully defended his thesis work in December 2005.

Yamini RudraRaju (master's student), Spring 2006. Yamini will spend the spring of 2006 on a research project in the bioinformatics laboratory. Yamini is the first student to participate in the joint EKU-UofL bioinformatics master's program, and will be supported by the EKU INBRE subaward.
Her research project will involve the computational design of degenerate primers.

Other Supported Staff

Larissa Tiouliandina, Summer 2005-present. Larissa has been working closely with Dr. Eric Rouchka and Ms. Stephanie Dearing to develop a database to help track information gathered from the KBRIN-INBRE network, including grants, publications, presentations, honors and awards, and student tracking information. The database system will have several levels of security for administrators, lead faculty, external advisory members, and individual researchers. A targeted release date has been set for late spring 2006.

Bioinformatics Theses and Dissertations

The following students have successfully completed master's projects, theses, or dissertations at the University of Louisville within the past year with topics related to bioinformatics. KBRIN-affiliated faculty involved in the committees are listed.

Abhijit Phatak. Using relational databases to analyze microarray probes and single nucleotide polymorphisms (2005). Eric Rouchka, advisor. Project defense date: August 2005.

Ravishrikanth Gundlapalli. Microarray Database Resource for Designing Custom Microarrays (2005). Eric Rouchka, advisor (Nigel Cooper, committee member). Defense date: December 2005.

Molecular Modeling Resources

In October 2002, KBRIN and the Department of Biochemistry & Molecular Biology established a biomolecular modeling and graphics system for teaching and research purposes. A reconditioned dual-processor Silicon Graphics Octane graphics workstation running Irix 6.4 was purchased and connected to the university network. The SGI Octane is equipped with a large color monitor, 1 MB memory, 100 GB hard drive, and a Crystal View 3-D system that allows viewing graphics as high-quality 3-dimensional images that can be manipulated with a mouse or cursor keys. Three-year licenses for the following software were purchased from Accelrys, Inc.:
Insight II 2002.1 (molecular visualization and analysis; control module for other software)
Biopolymer (construction of polypeptides, carbohydrates, and nucleic acids from building blocks)
Homology (builds a 3-D model of a protein from its amino acid sequence aligned to the sequence of a homologous template protein of known structure)
Modeler (automatically generates 3-D homology models of proteins; similar to Homology but more versatile)
Discover 1 & 2 and associated force fields (molecular dynamics programs)
Decipher (analysis of complex molecular structures and output from molecular dynamics simulations)
Binding Site Analysis (characterization and comparison of protein binding sites)

One year later, the following modules were added (perpetual licenses) with partial support from KBRIN and Biochemistry:

Profiles 3D (database searching for structural motifs in proteins for similarity analysis)
SeqFold (protein fold recognition to help identify protein function; analyzes structural similarity based on physical-chemical attributes of amino acids in the protein structure data bank, PDB)
CHARMm (molecular dynamics calculations for macromolecules)
Search/Compare (compares conformations of different macromolecules)

In December 2006, after expiration of licenses purchased in 2002, we renewed the following modules and added some new modules. Perpetual licenses were purchased with the financial support of KBRIN, Biochemistry, the JG Brown Cancer Center, and individual researchers in Biochemistry and the Cancer Center.
We now have updated licenses for the following:

Insight II 2005.1
Biopolymer
Modeler

We also purchased perpetual licenses for:

ZDockPro (protein-protein docking from PDB files)
Delphi (rigorous calculation of electrostatic potentials for proteins and polynucleotides)
Ludi/CAP (fits molecules into the active site of a receptor by identifying complementary polar and non-polar groups from a library of commercially available small molecules)

Computer Resources

The Bioinformatics Lab currently has four students working on bioinformatics full-time (two Ph.D. and two master's). Two undergraduate students are working part-time in the bioinformatics lab, and one high school student is working on a bioinformatics-related science fair project under the direction of Dr. Rouchka. Two additional master's students have been working in the bioinformatics lab during the past year.

There are three main servers maintained for use by KYBRIN investigators: the 16-node, 32-CPU kybrin Beowulf cluster, a dual-processor RedHat Linux 9.0 server, and a dual-processor Windows Server 2003 system.

kybrin.louisville.edu

The 16-node, 32-CPU Beowulf cluster (kybrin.louisville.edu) is the most heavily used of the systems, mainly for bioinformatics research requiring parallel computation. Currently, there are 45 users who have interactive login accounts. A 2.4-terabyte RAID network attached storage (NAS) system is currently tied directly into the Beowulf cluster machine housed in the Dahlem supercomputing center. Currently, 374 GB of the 2.4 terabytes of storage is used. Research performed on the Beowulf cluster during the past year is described in the Research section.

Upgrades to the kybrin.louisville.edu system this last year include the purchase of memory to expand the web server to 4 GB, a backup master node, and several hard drives. Each of these has been put into place to help minimize downtime of the system in case of a hardware failure.

Invitrogen License Server.
In addition to the research performed, the KYBRIN Beowulf cluster also acts as the University of Louisville license server for the statewide license for Invitrogen's Vector NTI and Vector XPression software. The following table and two graphs show the usage of the Informax/Invitrogen software over the past three years. Note that for 2004-2005 about three weeks of usage data are missing. In addition, the two heaviest users were excluded, since they severely skew the data (raising the VectorNTI usage to over 93,000 licenses granted). These data indicate that overall usage increased at a rate of about 150% per year through 2005, mainly due to the VectorNTI package. However, the sharp decline in the use of Expression indicates this is not a viable alternative for microarray data analysis. The sharp decline in usage in 2005-2006 can be attributed to two factors: a decline in usage by the three major users of the software, and Invitrogen's September 15, 2005 announcement of an Open Access Policy for Academic and Government researchers beginning with VectorNTI v10.0, released the same day. The policy effectively eliminates the need for a dynamic license server (and therefore the ability to accurately track usage).

Year            Users  VectorNTI  AlignX  BioPlot  ContigExpress  Expression  Total Licenses
2002-2003          76      5,070     713       50            221         161           6,215
2003-2004          93      7,677   1,171       78            634         114           9,889
2004-2005 [1]      91      9,848     886       26            643           4          11,350
2004-2005 [2]      91     93,469     886       26             64           4          94,971
2005-2006          66      2,416     660       26            405           4           3,511

Table 1: Informax/Invitrogen Usage (University of Louisville Dynamic License Server). [1] This data excludes the two highest users of VectorNTI. [2] This data includes all users of all software packages.

Access Grid Node Server. The KYBRIN Beowulf cluster also serves as an Access Grid (AG) Node server.
This server will allow researchers across the state of Kentucky to participate in remote discussions using relatively few resources: basically a headset and a web camera. Nathan Johnson and Ed Birchler of the Dahlem Supercomputer Center at the University of Louisville have been visiting various KYBRIN sites throughout the past year to set up individual clients and to assess the technical capabilities of each university.

kbrin.a-bldg.louisville.edu

This system is a dual-processor 2 GHz AMD server running RedHat Linux, used for a variety of web applications and development. There are currently five interactive logins for this system. Development performed on this server includes the MPrime primer design software and MySQL microarray database development. The server also hosts the web pages for CECS 660-01, Introduction to Bioinformatics. In addition, this system is used for administrative purposes, including a GrantSlam server, and for web applications requiring mailed forms and database input and querying. From 6/12/2005 to 1/6/2006, there were a total of 374,272 page hits to this server. The majority of these hits are listed in Table 2.

Description of Pages                                             Number of Hits
CECS 660 Introduction to Bioinformatics Online Course Material          215,867
University of Louisville Bioinformatics Research Group (BRG)             31,853
CECS 302 Information Structures Course Material                          23,975
EMBOSS                                                                    5,336
Bioinformatics Journal Club                                               2,721
Local Bioinformatics Research Papers                                      1,078
MPrime Multiple Primer Design                                               668

Table 2: Summary of majority of web hits for kbrin.a-bldg.louisville.edu.

medschsrv.spd.louisville.edu

This system is a dual-processor Windows Server 2003 system. There are two primary uses of this system: as a Windows-based license server (including S+ 6.1 and ArrayAnalyzer 2.0) and as a database development tool using MS-SQL.
Conferences and Meetings

Fifth Virtual Conference on Genomics and Bioinformatics (10/25/2005-10/28/2005)

For the fourth consecutive year, the access grid node at the University of Louisville broadcast the Virtual Bioinformatics Conference, with around ten students and faculty tuning in at various times. This year, Dr. Eric Rouchka gave a presentation on "Statewide bioinformatics in Kentucky." The plan is to broadcast it again next year.

Kentucky Academy of Sciences (11/10/2005-11/12/2005)

This year, KAS once again sponsored a session on Computer and Information Sciences, with the emphasis on bioinformatics applications. A total of ten oral presentations and four posters were presented. Three of the four undergraduate winners for oral presentations were advised by KBRIN-sponsored faculty, as were both of the graduate winners. The president and secretary for this session for 2006 were re-elected, and both (Chuck Staben, University of Kentucky; Eric Rouchka, University of Louisville) are KYBRIN-affiliated researchers.

Place                  Student            School                KBRIN Advisor
1st Undergrad          Eren Turgay        Univ. of Kentucky     Jaromczyk
2nd Undergrad          Wesley Asher       Eastern Kentucky      Bautista
3rd Undergrad          Nathan Gilbert     Morehead State        N/A
4th Undergrad          Lauren Stein       Eastern Kentucky      Bautista
1st Undergrad Poster   Brianna Paisley    Eastern Kentucky      Bautista
1st Grad               Josh Gilkerson     Univ. of Kentucky     Jaromczyk
2nd Grad               Ravi Gundlapalli   Univ. of Louisville   Rouchka; Cooper

Table 3: KAS Computer and Information Science Section Winners.

The opening symposium for this year's Kentucky Academy of Sciences (KAS) meeting was based on statewide bioinformatics and was entitled "Genomics & Bioinformatics: Research, Resources & Programs in Kentucky."
This symposium included a panel discussion involving five KBRIN-INBRE researchers: Chuck Staben (University of Kentucky), Eric Rouchka (University of Louisville), Claire Reinhart (Western Kentucky University), Pat Calie (Eastern Kentucky University), and Mark Bardgett (Northern Kentucky University). KAS held for the first time a session on Computer and Information Sciences, with the emphasis on bioinformatics applications. The president and secretary for this session for 2005 were elected, and both (Chuck Staben, University of Kentucky; Eric Rouchka, University of Louisville) are KYBRIN-affiliated researchers.

During April 1-3, 2005, researchers from across Tennessee and Kentucky met at Lake Barkley State Park in Cadiz, Kentucky for a summit on bioinformatics. In addition to 15 talks and a poster session by researchers in Kentucky and Tennessee, there were a number of well-respected invited speakers, including Chip Lawrence (Brown University), Martin Tompa (University of Washington), Paul Thomas (Applied Biosystems), Dan Gusfield (UC Davis), Seth Grant (Wellcome Trust Sanger Institute), Carey Phillips, and Jean Thierry-Mieg (NIH). Over 160 people attended the summit, more than 80 of them from Kentucky, and the majority had a direct or indirect affiliation with KYBRIN. The conference was sponsored in part by KYBRIN. Four members of the organizing committee, Nigel Cooper (UofL), Eric Rouchka (UofL), Stephanie Dearing (UofL), and Chuck Staben (UK), are directly affiliated with KYBRIN, while a fifth, Daniel Goldowitz (UT Memphis), is on the external advisory board.

UT-ORNL-KBRIN Bioinformatics Summit 2006 (4/21/2006-4/23/2006)

During April 21-23, 2006, researchers from across Tennessee and Kentucky will once again gather at Lake Barkley State Park in Cadiz, Kentucky for the Bioinformatics Summit.
This year's speakers include Hamid Bolouri (Institute for Systems Biology), David Galas (Battelle), Isaac Kohane (Harvard), Dan Masys (Vanderbilt), David Schwartz (NIEHS), Robert Kavlock (EPA), Wyeth Wasserman (Univ. of British Columbia), and Jack Keene (Duke). There will be several workshops on topics such as bioinformatics education, proteomic analysis, microarray analysis, and imaging atlases, as well as sessions on Genes and the Environment, The Regulome, and Clinical Bioinformatics.

Research

Bioinformatics Research Group (BRG)

The Bioinformatics Research Group (BRG) has been active since mid-2001. The BRG brings together researchers from across the University of Louisville's campus in order to promote collaboration between researchers in the School of Public Health, Health Sciences, and Computer Engineering and Computer Science. The BRG holds meetings twice a month, once at the Speed School of Engineering and once at the Health Sciences Campus. The main focus of the BRG is on the development of databases, applications, and meta-analysis of microarrays. The BRG website is hosted at: http://kbrin.a-bldg.louisville.edu/brg/

Research Projects

The following research projects have been active within the past year. In addition, these projects have used computing resources made available through the KBRIN-INBRE grant.

Detection of Tandem Repeats in the Zebrafish Genome (E.C. Rouchka, University of Louisville)

The zebrafish genome is highly polymorphic, with single nucleotide polymorphisms (SNPs) occurring on average once every 200 bases. However, a high-density genetic map of the genome does not currently exist. Due to the availability of a rough draft assembly of the zebrafish genome and the underlying trace data, it is now possible to construct such a map by looking at regions that are more susceptible to polymorphisms.
One such class of regions is tandemly repeated elements, which have been known to cause disease (such as Fragile X, Myotonic Dystrophy, and Huntington's Disease in humans) due to variations in the copy number. Computational detection of these tandemly repeated regions and design of primer pairs to amplify them can help molecular biologists determine which of these regions have observable polymorphisms in a population and can therefore be used as genetic markers. A total of 19,845 regions containing a repeat of length greater than four with a copy number greater than 10 and a total length of 250 base pairs or less have been detected. Currently, we are working with a molecular biology lab to determine whether a subset of the pentamer repeats is polymorphic within the population.

MiDaR: Microarray Database Resource (EC Rouchka, J Hornsby, R Gundlapalli, R Jones, D-J Chang, S El-Hadik, A Desoky, A Elmaghraby, NGF Cooper, University of Louisville)

Microarray technology has the potential to yield a vast amount of gene expression data. MiDaR is a multifaceted project to explore the implementation of a database management and analysis resource for both custom microarrays (cDNA and oligo based) and commercial microarrays (including Affymetrix and Agilent). The various aspects of MiDaR include:

Custom Chip Design (Gundlapalli, Rouchka and Cooper). This part of the project deals with the physical layout of customized microarrays, allowing the user to determine which genes or gene groups to place on a given chip, as well as designing the oligos and/or primer products to represent each gene. Each of these layouts can be stored in a database and edited at a later time.

Management of Microarray Data (Hornsby, Jones, Rouchka and Chang). Object-relational database schemas for the management of microarray data from different sources are considered in this portion of MiDaR. Data from different sources will be interchangeable for analysis purposes.
Relation of Microarray Data to Biological Information (Jones, Rouchka). MiDaR will also allow the user to associate microarray information with other relevant genetic information, including related gene and protein structures, homologous sequences, and genetic regulatory pathways, both known and unknown. This portion of MiDaR will potentially be tied into other sources of data as well, including proteomics and metabolomics.

Data Analysis (Rouchka, El-Hadik, Desoky, Elmaghraby). The analysis of the differential expression of genes relative to a given cell type, stimulus, or time point is the most desired information for a biologist. Portions of MiDaR are concerned with taking the data stored within the data management system, analyzing it, and placing the results of the experimental comparison back in the database. The analyses are performed using customized routines connected with R, Bioconductor, and Matlab bioinformatics.

Multiple Primer Design (EC Rouchka; A Khalyfa; NGF Cooper, University of Louisville)

Motivation: Enhancements in sequencing technology have recently yielded assemblies of large genomes including rat, mouse, and human. The availability of large-scale genomic and genic sequence data, coupled with advances in microarray technology, has made it possible to study the expression of large numbers of sequence products under several different conditions in days, where traditional molecular biology techniques might have taken months or even years. Therefore, to efficiently study a number of gene products associated with a disease, pathway, or other biological process, it is necessary to be able to design primers en masse rather than using a time-consuming and laborious gene-by-gene method. As a result, we have developed an integrated system, MPrime, to efficiently calculate primer pairs for multiple genic regions based on a keyword, gene name, or accession number within the rat, mouse, and human genomes.
Results: A set of products created for mouse housekeeping genes from MPrime-designed primers has been validated using both PCR amplification and DNA sequencing. These results indicate MPrime accurately incorporates standard PCR primer design characteristics to produce high-scoring primer pairs within the mouse, rat, and human genomes.

EST Clustering and Analysis (IE Cha, EC Rouchka, DJ Chang, University of Louisville)

The availability of comparative genome sequence data as well as expressed sequence tag (EST) data has made it possible to study gene structure within humans and comparative genomes, such as mouse and rat. In this project, we are interested in exploring efficient methods for mapping EST sequences to their genomic counterparts for the purpose of clustering and comparing gene structure and alternative splicing in multiple organisms. We are currently exploring the use of cluster and grid technologies for this purpose.

Motif Detection Using Heuristic Algorithms (CT Hardin, EC Rouchka, University of Louisville)

Conserved patterns (motifs) are useful both for classification purposes and for understanding relations between sequence, structure, and function of proteins. We are working to identify new heuristic-based algorithms to identify these motifs in unaligned protein or nucleotide sequences. The goals of the research are twofold: (1) find patterns that are not discovered using other methods and (2) improve the computational efficiency of existing algorithms.

Student research. (Dr. Robert Gray and Mary Constantino, University of Louisville)

Mary Constantino made a homology model of a BTB binding domain from Drosophila based on a similar domain from humans. She was able to rationalize a site-directed mutation experiment based on the model in her Ph.D. thesis.

Faculty research. (Joint efforts with Robert Gray, University of Louisville)

Dr. Gray has done homology modeling for Dr. Ramos and protein-protein docking and graphics for Dr. Wang.
In addition, a few other faculty members have indicated an interest in homology modeling once the licenses have been renewed.

Educational Opportunities

Course Related Work

CECS 660 Introduction to Bioinformatics. During 2005-2006, the Introduction to Bioinformatics course, CECS 660-01, had nine students pre-enrolled, including students from the Health Sciences Campus, Speed School of Engineering, and the Business School. CECS 660-01 will continue to be offered every spring, while the bioinformatics journal club will likely be offered on an interest basis. For the spring of 2006, there are eight students pre-enrolled, including two students from the biostatistics department, one from the medical school, one visiting student from Eastern Kentucky University, and four from the Speed School of Engineering.

Biochem 670 (Protein Structure and Function). Eighteen graduate students have taken the course in the last 3 years. Sixteen were from BMB, 1 from Biology, and 1 from Microbiology/Immunology. A 2-hr lecture on molecular modeling was added to the course, focusing mostly on homology modeling and sequence alignment. Most of the students spent ~1 hour running a standard tutorial on the Modeler software in which they constructed a model of a serine protease based on trypsin and chymotrypsin.

Joint Master's Program in Bioinformatics. The joint Master's program between the University of Louisville and Eastern Kentucky University (EKU) is progressing, with the first student, Yamini Rudraraju, spending the spring of 2006 working on a research project in the KBRIN bioinformatics lab. More students are in the pipeline, and there will be on average 1-3 students per year.

Training

Dr. Gray (with support from BMB) has attended three 2-day training workshops at the Accelrys headquarters north of San Diego.
These workshops are: Introduction to Life Science Modeling with Insight II, Homology-Based Protein Design, and CHARMm: Molecular Mechanics and Molecular Dynamics. These workshops included extensive lectures and hands-on, guided tutorials. Included were detailed workbooks with many exercises and literature references as well as scripts for running complex modeling projects. These workbooks are available for anyone at UofL to work through.
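
The tandem repeat criteria reported for the zebrafish project under Research Projects (a repeat unit longer than four bases, a copy number greater than 10, and a total length of 250 base pairs or less) can be illustrated with a minimal Python sketch. The report does not state how the repeats were actually detected, so everything here is an assumption for illustration: the function names are hypothetical, only perfect (exact) repeats are matched, and a unit that is itself a repetition of a shorter string is ignored.

```python
def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter string."""
    return unit not in (unit + unit)[1:-1]


def find_tandem_repeats(seq, min_unit=5, min_copies=11, max_total=250):
    """Return (start, unit, copies) for each perfect tandem repeat in `seq`.

    Default thresholds mirror the text: unit length > 4, copy number > 10,
    total length <= 250 bp. Exact matching is an assumption of this sketch.
    """
    seq = seq.upper()
    max_unit = max_total // min_copies  # longer units cannot fit min_copies
    hits = []
    i = 0
    while i < len(seq):
        advanced = False
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # ran off the end of the sequence
            # Count how many consecutive copies of `unit` start at i.
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies and is_primitive(unit):
                total = copies * k
                if total <= max_total:
                    hits.append((i, unit, copies))
                i += total  # skip the whole run, whether reported or not
                advanced = True
                break
        if not advanced:
            i += 1
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: a pentamer repeated 12 times between flanks.
    demo = "GATTACA" + "ACGTT" * 12 + "CCGGA"
    print(find_tandem_repeats(demo))
```

Each reported region could then be passed, with its flanking sequence, to a primer design step so that the repeat can be amplified and tested for copy-number polymorphism. A production scan of a whole genome would need approximate matching and a faster search than this quadratic loop.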

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Exploratory Grants (P20)
Project #: 5P20RR016481-06
Application #: 7381767
Study Section: Special Emphasis Panel (ZRR1-RI-7 (01))

Project Start: 2006-05-01
Project End: 2007-04-30
Budget Start: 2006-05-01
Budget End: 2007-04-30
Support Year: 6
Fiscal Year: 2006
Total Cost: $169,743
Indirect Cost

Institution

Name: University of Louisville
Department: Anatomy/Cell Biology
Type: Schools of Medicine
DUNS #: 057588857

City: Louisville
State: KY
Country: United States
Zip Code: 40292
