Technological advances in DNA sequencing are enabling new areas of genomic research, allowing finer connections to be drawn between genomic variants and phenotype. We are studying transcriptomes and genomes from a variety of eukaryotic and prokaryotic non-model organisms, and this work has contributed new biological insights. Our first analysis addressed the transcriptional regulation of N-acetylglutamate synthase, where we predicted the involvement of Sp1, CREB, HNF-1, and NF-Y in mammalian ureagenesis. The second study led to the first characterization of the Cape gooseberry transcriptome; here we used the tomato and potato genomes to predict gene models and developed nearly 6,000 SSR markers from assembled ESTs. A bacterial sequencing study of six Rhodanobacter strains isolated from soil revealed variable denitrification capabilities, contributing to our knowledge of the biochemical pathways in the genus. A third study involves the evaluation of DNA barcode efficacy for taxonomic identification; here we use the probability of correct identification (PCI) as the measure of barcode efficacy. The last study involved the in silico identification and characterization of the ion transport specificity of P-type ATPases in the Mycobacterium tuberculosis complex; here we used a number of computational methods to study and classify the P-type ATPases into three major groups: heavy metal cation transporters, alkali and alkaline earth metal cation transporters, and the beta subunit of a prokaryotic potassium pump.
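
To illustrate the kind of SSR (simple sequence repeat) mining mentioned for the Cape gooseberry ESTs, the following is a minimal Python sketch that scans a sequence for perfect tandem repeats with a regular expression. The function name `find_ssrs`, the motif length range (2 to 6 bp), and the minimum repeat count are assumptions chosen for illustration; they are not the software or parameters used in the study.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """
    seq = seq.upper()
    # Group 1 captures the motif (shortest candidate tried first);
    # \1{n,} then requires at least n further consecutive copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        # Skip single-base runs such as AAAAAAAA reported as (AA)n.
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Example on a made-up sequence containing (AT)5 and (AAG)4.
example = "GGC" + "AT" * 5 + "CCT" + "AAG" * 4 + "TT"
print(find_ssrs(example))
```

A real marker-development pipeline would normally also handle imperfect repeats and design flanking primers, which this sketch does not attempt.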

Support Year: 1
Fiscal Year: 2013
Total Cost: $164,370
Name: National Library of Medicine
Jjingo, Daudi; Conley, Andrew B; Wang, Jianrong et al. (2014) Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression. Mob DNA 5:14
Spouge, John L; Mariño-Ramírez, Leonardo; Sheetlin, Sergey L (2014) Searching for repeats, as an example of using the generalised Ruzzo-Tompa algorithm to find optimal subsequences with gaps. Int J Bioinform Res Appl 10:384-408