DNA and protein sequences digitally store information about biological function in a complex code that is not yet fully understood. The fundamental unit of this code is the sequence motif, which is defined as a small, recurring DNA or protein sequence pattern. A DNA motif might be involved, for example, in turning the transcription of a gene on or off in response to environmental cues. A protein motif might encode the properties of the binding site that allows the protein to carry out its function. The MEME Suite of motif-based sequence analysis software builds statistical models of DNA and protein motifs, allowing biologists to discover novel motifs, to search for new instances of known motifs, and to compare motifs to one another. This proposal will continue the development and maintenance of the MEME Suite, which is in regular use by biologists around the world.
The aims of this work are fivefold: (1) to increase the accessibility, usability and interoperability of the MEME Suite, (2) to expand the MEME Suite to handle epigenetic data regarding histone modifications, methylation, nucleosome positioning and DNase I hypersensitive sites, (3) to integrate a variety of existing motif-based software tools into the MEME Suite, (4) to augment the algorithms used by the MEME Suite with proven enhancements, and (5) to continue to improve our user support services.

Public Health Relevance

This project will improve existing, widely used software that enables biologists to understand how DNA and protein sequences encode information about biological function. Identifying and accurately characterizing functional sequence motifs allows scientists to understand how genes are turned on and off and how proteins carry out their functions in the cell. Such knowledge is fundamental to any model of the basic molecular mechanisms of the cell and, in particular, to molecular-scale models of disease processes.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Research Project (R01)
Project #: 5R01RR021692-07
Application #: 8129528
Study Section: Special Emphasis Panel (ZRG1-BST-Q (01))
Program Officer: Swain, Amy L
Project Start: 2009-09-28
Project End: 2013-08-31
Budget Start: 2011-09-01
Budget End: 2012-08-31
Support Year: 7
Fiscal Year: 2011
Total Cost: $325,929
Indirect Cost:

Name: University of Washington
Department: Genetics
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195

Larney, Christian; Bailey, Timothy L; Koopman, Peter (2015) Conservation analysis of sequences flanking the testis-determining gene Sry in 17 mammalian species. BMC Dev Biol 15:34
O'Connor, Timothy R; Bailey, Timothy L (2014) Creating and validating cis-regulatory maps of tissue-specific gene expression regulation. Nucleic Acids Res 42:11000-10
Ma, Wenxiu; Noble, William S; Bailey, Timothy L (2014) Motif-based analysis of large nucleotide data sets using MEME-ChIP. Nat Protoc 9:1428-50
O'Brien, Aidan; Bailey, Timothy L (2014) GT-Scan: identifying unique genomic targets. Bioinformatics 30:2673-5
Lajoie, Mathieu; Hsu, Yu-Chih; Gronostajski, Richard M et al. (2014) An overlapping set of genes is regulated by both NFIB and the glucocorticoid receptor during lung maturation. BMC Genomics 15:231
Thandapani, Palaniraja; O'Connor, Timothy R; Bailey, Timothy L et al. (2013) Defining the RGG/RG motif. Mol Cell 50:613-23
Buske, Fabian A; Bauer, Denis C; Mattick, John S et al. (2013) Triplex-Inspector: an analysis tool for triplex-mediated targeting of genomic loci. Bioinformatics 29:1895-7
Bailey, Timothy; Krajewski, Pawel; Ladunga, Istvan et al. (2013) Practical guidelines for the comprehensive analysis of ChIP-seq data. PLoS Comput Biol 9:e1003326
Peterson, Kevin A; Nishi, Yuichi; Ma, Wenxiu et al. (2012) Neural-specific Sox2 input and differential Gli-binding affinity provide context and positional information in Shh-directed neural patterning. Genes Dev 26:2802-16
McLeay, Robert C; Lesluyes, Tom; Cuellar Partida, Gabriel et al. (2012) Genome-wide in silico prediction of gene expression. Bioinformatics 28:2789-96

Showing the most recent 10 out of 30 publications