The DNA databases represent a paradigm shift in science: the end result of large-scale data generation is not so much publications as a public web-based resource (genome browsers) with integrated component databases of high accuracy and sensitivity. The genome browsers are rapidly becoming one of the major tools of the biomedical research community, and are dramatically expediting scientific discovery in genetics and biology in general. The DNA databases are relatively "static" in the sense that there are few variables to be considered; e.g., "time" and "amplitude" are generally not variables. Genome-wide information on RNA and protein is much more dynamic, with many more variables, most of which are themselves dynamic. Current efforts at providing public-access databases of RNA expression profile data have been largely limited to "data repositories", with only limited query tools and cross-experiment investigations possible. Major limitations on the development of more powerful expression profile data warehouses and analytical tools have been the lack of a standard experimental platform and considerable difficulty in signal/noise interpretation. We have recently circumvented many of these limitations, and have made public the first web-Oracle integrated expression-profiling database. We have developed strict QC/SOP protocols using the data-intensive and accurate Affymetrix platform, have developed a web-queried data warehouse with some analytical tools, and have written and implemented automated data conversion protocols for web publication to both our web server and NCBI GEO. Indeed, about 50% of all expression profile data in the popular UCSC genome browsers is from our integrated internal/web LIMS Oracle data warehouse (5,350 scans currently stored and analyzed).
Here, our R21 request is to develop innovative web-based analytical tools that allow the casual user to query their gene of interest in any or all of the arrays in our data warehouse (human, mouse, rat; >140 projects, many with time series data). The major benchmark of our R21 phase will be the demonstration of widespread popularity of this innovative tool, by quantitation of data queries and user feedback. The R33 phase will comprise implementation of remote data entry and QC checks by other core facilities, expansion of the warehouse to >10,000 vertebrate profiles, and development of a suite of innovative user analysis and visualization tools for the web data warehouse.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21HG002946-01
Application #
6688130
Study Section
Special Emphasis Panel (ZRG1-SSS-Y (92))
Program Officer
Good, Peter J
Project Start
2003-09-30
Project End
2004-09-29
Budget Start
2003-09-30
Budget End
2004-09-29
Support Year
1
Fiscal Year
2003
Total Cost
$164,000
Indirect Cost
Name
Children's Research Institute
Department
Type
DUNS #
143983562
City
Washington
State
DC
Country
United States
Zip Code
20010
Van Deveire, Katherine N; Scranton, Sarah K; Kostek, Mathew A et al. (2012) Variants of the ankyrin repeat domain 6 gene (ANKRD6) and muscle and physical activity phenotypes among European-derived American adults. J Strength Cond Res 26:1740-8
Bakay, Marina; Wang, Zuyi; Melcon, Gisela et al. (2006) Nuclear envelope dystrophies show a transcriptional fingerprint suggesting disruption of Rb-MyoD pathways in muscle regeneration. Brain 129:996-1013