The Analysis of Signal Elements in Promoter Sequences

Spouge, John

Abstract

The signal elements in promoter sequences are not well characterized. We propose to develop statistical tests to find nucleotide words (generally of length 8) that appear localized relative to TSSs (transcription start sites). These words will then form "seeds" for expansion to develop PSSMs (position-specific scoring matrices) characterizing systems of co-regulated genes. To this end, Dr. Marino-Ramirez has collected a database of about 4700 sequences around the TSSs of human genes. The database is exceptionally well characterized and ideal for our statistical study. We are using two statistics, the well-respected KS (Kolmogorov-Smirnov) statistic and a lesser-known Poisson scan statistic, to determine whether occurrences of a given 8-letter DNA word are clustered unusually relative to the TSS. The KS statistic provides a standard for sensitivity, but it does not assign statistical significance to a particular set of word occurrences. The Poisson scan statistic does, however, and we plan to use its list of significant occurrences as our "seeds" for developing our PSSMs.
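
The two kinds of statistic named above can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes word occurrences are pooled across TSS-aligned sequences into one list of positions, that the null model is a uniform distribution over the aligned window, and that a fixed-width window is scanned. The function names, the window width, and the example positions are hypothetical. The Poisson tail reported here is for a single window and is not corrected for scanning many windows, which a true scan statistic would require.

```python
import math


def ks_uniform_statistic(positions, seq_len):
    """One-sample KS statistic D against a uniform distribution on [0, seq_len]."""
    xs = sorted(p / seq_len for p in positions)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Compare the uniform CDF (x) with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)
    cdf = term
    for i in range(1, k):
        term *= mu / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def densest_window(positions, seq_len, width):
    """Return (start, count, tail) for the fixed-width window holding the most occurrences.

    tail is the Poisson tail probability for that single window under a uniform
    null (assumption); it is NOT adjusted for the number of windows scanned.
    """
    xs = sorted(positions)
    n = len(xs)
    mu = n * width / seq_len  # expected occurrences per window under the null
    best = (0, 0, 1.0)
    j = 0
    for i, start in enumerate(xs):
        # Advance j to the first occurrence outside [start, start + width).
        while j < n and xs[j] < start + width:
            j += 1
        count = j - i
        if count > best[1]:
            best = (start, count, poisson_tail(count, mu))
    return best


# Hypothetical positions of one 8-letter word within a 1000-bp aligned window.
clustered = [100, 480, 485, 490, 495, 500, 505, 510, 900]
d = ks_uniform_statistic(clustered, 1000)
start, count, tail = densest_window(clustered, 1000, 50)
print(f"KS D = {d:.3f}; densest window starts at {start} with {count} hits")
```

The sketch mirrors the contrast drawn in the abstract: the KS statistic summarizes departure from uniformity over the whole window, while the window scan points at a particular set of occurrences that could serve as a seed.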

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM091704-01
Application #: 6988472
Study Section: (CBB)

Project Start
Project End
Budget Start
Budget End
Support Year: 1
Fiscal Year: 2004
Total Cost
Indirect Cost

Institution

Name: National Library of Medicine
Department
Type
DUNS #

City
State
Country: United States
Zip Code

Related projects


NIH 2008 Z01 LM	The Analysis of Signal Elements in Promoter Sequences. Spouge, John L. / National Library of Medicine	$258,645
NIH 2007 Z01 LM	The Analysis of Signal Elements in Promoter Sequences. Spouge, John L. / National Library of Medicine	$264,808
NIH 2006 Z01 LM	The Analysis of Signal Elements in Promoter Sequences. Spouge, John L. / National Library of Medicine
NIH 2005 Z01 LM	Analysis of Signal Elements in Promoter Sequence. Spouge, John L. / National Library of Medicine
NIH 2004 Z01 LM	The Analysis of Signal Elements in Promoter Sequences. Spouge, John L. / National Library of Medicine

Publications

Tharakaraman, Kannan; Marino-Ramirez, Leonardo; Sheetlin, Sergey L et al. (2006) Scanning sequences after Gibbs sampling to find multiple occurrences of functional elements. BMC Bioinformatics 7:408

Kim, Nak-Kyeong; Tharakaraman, Kannan; Spouge, John L (2006) Adding sequence context to a Markov background model improves the identification of regulatory elements. Bioinformatics 22:2870-5

Tharakaraman, Kannan; Marino-Ramirez, Leonardo; Sheetlin, Sergey et al. (2005) Alignments anchored on genomic landmarks can aid in the identification of regulatory elements. Bioinformatics 21 Suppl 1:i440-8

Marino-Ramirez, Leonardo; Spouge, John L; Kanga, Gavin C et al. (2004) Statistical analysis of over-represented words in human promoter sequences. Nucleic Acids Res 32:949-58

Frith, Martin C; Hansen, Ulla; Spouge, John L et al. (2004) Finding functional sequence elements by multiple local alignment. Nucleic Acids Res 32:189-200
