Detecting Subtle Signals in Genomic Sequence

Lawrence, Charles

Abstract

It is estimated that there are approximately 80,000 genes in the human genome (Fields C. et al., 1994). To turn this genetic blueprint into a functional organism, genes must be expressed in a specific temporal and spatial pattern. Finding signals that control this expression and understanding their language is one of the major challenges of the post-genome era. Laboratory identification of regulatory elements, modules, and regions in genomic sequences is often an arduous, time-consuming, and expensive process. If specific approaches can be developed, computational analyses promise to accelerate this process at minimal cost. The long-term goal of the proposed research is to develop and apply Bayesian bioinformatics computational methods that will describe the complete wiring diagram for a genome's transcription regulation system. This description will include four components: 1) the identification of all superfamilies of transcription factors and their classification into functionally related subclasses based on both the DNA recognition motifs and the activator domains; 2) the identification and characterization of a genome's transcriptional regulatory modules and all factor binding elements within them; 3) the full delineation of the connections between factors and their binding elements; and 4) a characterization of alternative transcriptional regulatory motifs, including those based on DNA composition, and DNA and RNA structure. These goals will be addressed using Bayesian statistical models and algorithms, the foundations for which we developed during the current award period. These include Gibbs sampling algorithms to assemble superfamilies of transcription factors and multiply align them; transcription factor classification algorithms; exact Bayesian algorithms for the description of compositional and structural heterogeneity, RNA secondary structure, and phylogenetic footprinting; and a recursive Gibbs sampling HMM for regulatory module identification and characterization.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG001257-05
Application #: 6343262
Study Section: Special Emphasis Panel (ZRG1-GNM (03))
Program Officer: Good, Peter J

Project Start: 1995-09-20
Project End: 2004-12-31
Budget Start: 2001-01-01
Budget End: 2001-12-31
Support Year: 5
Fiscal Year: 2001
Total Cost: $326,246

Institution

Name: Wadsworth Center
DUNS #: 110521739

City: Menands
State: NY
Country: United States
Zip Code: 12204

Related projects


NIH 2004 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Wadsworth Center	$489
NIH 2004 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Brown University	$353,003
NIH 2003 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Wadsworth Center	$344,139
NIH 2002 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Wadsworth Center	$335,060
NIH 2001 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Wadsworth Center	$326,246
NIH 2000 R01 HG	Detecting Subtle Signals in Genomic Sequence	Lawrence, Charles E. / Wadsworth Center	$338,087
NIH 1997 R01 HG	Detecting Subtle Sequence Signals in Genomic Junk	Lawrence, Charles E. / Wadsworth Center
NIH 1996 R01 HG	Detecting Subtle Sequence Signals in Genomic Junk	Lawrence, Charles E. / Wadsworth Center
NIH 1995 R01 HG	Detecting Subtle Sequence Signals in Genomic 'Junk'	Lawrence, Charles E. / Wadsworth Center

Publications

Newberg, Lee A; Lawrence, Charles E (2009) Exact calculation of distributions on integers, with application to sequence alignment. J Comput Biol 16:1-18

Carvalho, Luis E; Lawrence, Charles E (2008) Centroid estimation in discrete high-dimensional spaces with applications in biology. Proc Natl Acad Sci U S A 105:3209-14

Webb-Robertson, Bobbie-Jo M; McCue, Lee Ann; Lawrence, Charles E (2008) Measuring global credibility with application to local sequence alignment. PLoS Comput Biol 4:e1000077

Newberg, Lee A; Thompson, William A; Conlan, Sean et al. (2007) A phylogenetic Gibbs sampler that yields centroid solutions for cis-regulatory site prediction. Bioinformatics 23:1718-27

Thompson, William A; Newberg, Lee A; Conlan, Sean et al. (2007) The Gibbs Centroid Sampler. Nucleic Acids Res 35:W232-7

Ding, Ye; Chan, Chi Yu; Lawrence, Charles E (2006) Clustering of RNA secondary structures with application to messenger RNAs. J Mol Biol 359:554-71

Conlan, Sean; Lawrence, Charles; McCue, Lee Ann (2005) Rhodopseudomonas palustris regulons detected by cross-species analysis of alphaproteobacterial genomes. Appl Environ Microbiol 71:7442-52

Chan, Chi Yu; Lawrence, Charles E; Ding, Ye (2005) Structure clustering features on the Sfold Web server. Bioinformatics 21:3926-8

Thompson, William; McCue, Lee Ann; Lawrence, Charles E (2005) Using the Gibbs motif sampler to find conserved domains in DNA and protein sequences. Curr Protoc Bioinformatics Chapter 2:Unit 2.8

Newberg, Lee A; McCue, Lee Ann; Lawrence, Charles E (2005) The relative inefficiency of sequence weights approaches in determining a nucleotide position weight matrix. Stat Appl Genet Mol Biol 4:Article13

Showing the most recent 10 out of 30 publications
