Prokaryotic and eukaryotic whole-genome sequence data are accumulating at an unprecedented pace. The next phase will be increasingly dominated by efforts to characterize, categorize, and analyze these data with the goal of understanding molecular sequence information and its significance in biological systems. Much current biological and medical research centers on DNA microarrays. The main focus of our research is to evaluate gene expression levels based on codon usage. Our sequence-based methods are complementary to the experimental procedures of 2D-gel electrophoresis in assessing gene expression levels. We have introduced a theoretical computational method for characterizing gene expression levels based on codon usage differences between gene classes. The method has been applied to a variety of genomes including fast-growing bacteria, the cyanobacterium Synechocystis PCC6803, and the radiation-resistant Deinococcus radiodurans (see Progress Report). For each bacterial genome we can predict highly expressed genes, and these predictions correlate very well with 2D-gel protein abundances. We propose to apply the methods to all complete genomes and illustrate here pilot studies for two groups of bacterial genomes. The first group consists of all available low G+C Gram-positive genomes, including the pathogens Listeria monocytogenes, Staphylococcus aureus, and Streptococcus pyogenes, and the nonpathogenic dairy fermentation bacterium Lactococcus lactis. The second group consists of all available high G+C α-proteobacteria; these genomes are important for understanding nitrogen fixation. A second aspect of our research will be to investigate the status of genes in several metabolic pathways and of several protein families among archaeal and bacterial species, contrasting the presence, absence, and expression levels of genes. A third major objective of our research will be to extend our codon usage methods for predicting gene expression levels to eukaryotic genomes, including yeast, D. melanogaster, C. elegans, and human.
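The idea of a "codon usage difference between gene classes" can be illustrated with a minimal Python sketch. The abstract does not state a formula, so everything here is an assumption for illustration: the function names (`codon_counts`, `normalized_freqs`, `codon_bias`) are hypothetical, the standard genetic code is assumed, and the measure shown (absolute differences of synonymous codon frequencies, weighted by the amino acid frequencies of the target set) is one plausible way to score such a difference, not necessarily the exact method used in this project.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(seqs):
    """Count sense codons over a collection of in-frame coding sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            # Skip stop codons and codons with ambiguous bases.
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1
    return counts


def normalized_freqs(counts):
    """Return (codon frequencies summing to 1 per amino acid, amino acid frequencies)."""
    aa_totals = defaultdict(int)
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n
    total = sum(aa_totals.values())
    codon_f = {c: n / aa_totals[CODON_TABLE[c]] for c, n in counts.items()}
    aa_f = {aa: n / total for aa, n in aa_totals.items()}
    return codon_f, aa_f


def codon_bias(target_seqs, reference_seqs):
    """Codon usage difference of a target gene set relative to a reference set.

    Assumed measure: for each amino acid, sum the absolute differences between
    synonymous codon frequencies in the two sets, then weight by the amino acid
    frequency in the target set.
    """
    f, p = normalized_freqs(codon_counts(target_seqs))
    g, _ = normalized_freqs(codon_counts(reference_seqs))
    return sum(
        p[aa] * abs(f.get(codon, 0.0) - g.get(codon, 0.0))
        for codon, aa in CODON_TABLE.items()
        if aa in p
    )
```

Under this sketch, a gene whose codon usage differs strongly from the average gene but resembles a reference class of known highly expressed genes would be a candidate for high expression. Comparing a gene set with itself yields a difference of zero, and the value grows as synonymous codon preferences diverge.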
Hao, Bingtao; Naik, Abani Kanta; Watanabe, Akiko et al. (2015) An anti-silencer- and SATB1-dependent chromatin hub regulates Rag1 and Rag2 gene expression during thymocyte development. J Exp Med 212:809-24
Macario, Alberto J L; Brocchieri, Luciano; Shenoy, Avinash R et al. (2006) Evolution of a protein-folding machine: genomic and evolutionary analyses reveal three lineages of the archaeal hsp70(dnaK) gene. J Mol Evol 63:74-86
Karlin, Samuel; Mrazek, Jan; Ma, Jiong et al. (2005) Predicted highly expressed genes in archaeal genomes. Proc Natl Acad Sci U S A 102:7303-8
Karlin, Samuel; Brocchieri, Luciano; Campbell, Allan et al. (2005) Genomic and proteomic comparisons between bacterial and archaeal genomes and related comparisons with the yeast and fly genomes. Proc Natl Acad Sci U S A 102:7309-14
Brocchieri, Luciano; Karlin, Samuel (2005) Protein length in eukaryotic and prokaryotic proteomes. Nucleic Acids Res 33:3390-400
Brocchieri, Luciano; Kledal, Thomas N; Karlin, Samuel et al. (2005) Predicting coding potential from genome sequence: application to betaherpesviruses infecting rats and mice. J Virol 79:7570-96
Karlin, Samuel; Theriot, Julie; Mrazek, Jan (2004) Comparative analysis of gene expression among low G+C gram-positive genomes. Proc Natl Acad Sci U S A 101:6182-7
Karlin, Samuel; Barnett, Melanie J; Campbell, Allan M et al. (2003) Predicting gene expression levels from codon biases in alpha-proteobacterial genomes. Proc Natl Acad Sci U S A 100:7313-8
Campbell, Allan (2003) Prophage insertion sites. Res Microbiol 154:277-82
Mrazek, Jan; Gaynon, Lisa H; Karlin, Samuel (2002) Frequent oligonucleotide motifs in genomes of three streptococci. Nucleic Acids Res 30:4216-21