Comparative sequence analyses are an essential component of contemporary genomics research. We are continuing to develop and apply new methods for detecting more complex types of evolutionary sequence constraint, such as lineage-specific constrained sequences or weakly constrained sequences, as well as other types of constraint that might not be reflected in the primary nucleotide sequence. Furthermore, we are utilizing inter-species sequence analyses of functional annotations to identify sequence signatures that might confer function; such approaches are particularly useful for detecting short sequences that are not evolutionarily constrained in orthologous positions across multiple species, but whose position relative to genes and other features is important. The Section is also developing high-throughput methods to experimentally detect and classify functional genomic sequences. This approach utilizes flow-cytometry-based selection of GFP-reporter constructs harboring candidate enhancer sequences. Preliminary results are promising and have confirmed that known sequences with enhancer activity can be detected. Scaling this approach to larger regions of the genome will require the use of new-generation sequencing technologies. Along these lines, the Section is utilizing a next-generation sequencing technology from Illumina/Solexa. We are working both on the wet-lab side, to get the instrument working optimally, and on the informatics side, developing approaches to maximize the amount of high-quality sequence data that can be extracted from the new instrument. We are pursuing a number of applications, including ChIP-Seq, RNA-Seq, and other "tag"-based counting experiments, as well as medically relevant whole-genome sequencing projects that can now be pursued at a dramatically reduced cost.
This work will enable not only the above-mentioned research projects in my Section, but also many others in the Institute that seek to take advantage of this new technology.

National Human Genome Research Institute
ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57-74
Goldfeder, Rachel L; Parker, Stephen C J; Ajay, Subramanian S et al. (2011) A bioinformatics approach for determining sample identity from different lanes of high-throughput sequencing data. PLoS One 6:e23683
Catalona, William J; Bailey-Wilson, Joan E; Camp, Nicola J et al. (2011) National Cancer Institute Prostate Cancer Genetics Workshop. Cancer Res 71:3442-6
Pilon, Andre M; Ajay, Subramanian S; Kumar, Swathi Ashok et al. (2011) Genome-wide ChIP-Seq reveals a dramatic shift in the binding of the transcription factor erythroid Kruppel-like factor during erythrocyte differentiation. Blood 118:e139-48
Ajay, Subramanian S; Parker, Stephen C J; Abaan, Hatice Ozel et al. (2011) Accurate and comprehensive sequencing of personal genomes. Genome Res 21:1498-505
Belgard, T Grant; Marques, Ana C; Oliver, Peter L et al. (2011) A transcriptomic atlas of mouse neocortical layers. Neuron 71:605-16
Teer, Jamie K; Bonnycastle, Lori L; Chines, Peter S et al. (2010) Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res 20:1420-31
Ewens, Kathryn G; Stewart, Douglas R; Ankener, Wendy et al. (2010) Family-based analysis of candidate genes for polycystic ovary syndrome. J Clin Endocrinol Metab 95:2306-15
Young, Andrew L; Abaan, Hatice Ozel; Zerbino, Daniel et al. (2010) A new strategy for genome assembly using short sequence reads and reduced representation libraries. Genome Res 20:249-56
Sommer, Wolfgang H; Lidstrom, Jessica; Sun, Hui et al. (2010) Human NPY promoter variation rs16147:T>C as a moderator of prefrontal NPY gene expression and negative affect. Hum Mutat 31:E1594-608

Showing the most recent 10 out of 15 publications