We will design and deploy software infrastructure that bridges diverse processes and resources in biotechnology and information science. In the process, we will design and implement significant improvements to the R programming language to support the development and use of these tools. In the domain of biotechnology, we will design software that enables the integration of experimental data with experimental metadata (e.g., MIAME), and we will develop tools to integrate biological metadata with experimental data. These tools will give other developers access to all methodology and will simplify interactions with other projects through a well-defined class system. We will develop software infrastructure for visualization, for combining data from multiple sources, and for accessing information programmatically from the WWW. In the domain of information technology, we will develop tools for creating, structuring, and harvesting annotation resources and databases of published literature, and we will provide interactive tools for linking experimental results to annotation and literature resources in real time. We will explore the development and deployment of Web services. The software architecture will address the highly dynamic nature of annotation and the scientific literature and will cope with multiple data sources of varying quality. Indexing of techniques with respect to error rates, resolutions, and capabilities will be supported. We will also develop and deploy guidance and infrastructure that enable the software tools to be used through graphical interfaces. Additionally, we will use the tools described above to develop new computational methods for genomic data, with particular attention to visualization, computational inference, multiple comparisons, and specialized analytic methods for microarray data.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Exploratory/Developmental Grants Phase II (R33)
Project #: 1R33HG002708-01A1
Application #: 6689479
Study Section: Special Emphasis Panel (ZRG1-SSS-Y (92))
Program Officer: Good, Peter J
Project Start: 2003-09-30
Project End: 2006-06-30
Budget Start: 2003-09-30
Budget End: 2004-06-30
Support Year: 1
Fiscal Year: 2003
Total Cost: $770,540
Indirect Cost:

Name: Dana-Farber Cancer Institute
Department:
Type:
DUNS #: 076580745
City: Boston
State: MA
Country: United States
Zip Code: 02215

Carey, Vincent J; Morgan, Martin; Falcon, Seth et al. (2007) GGtools: analysis of genetics of gene expression in bioconductor. Bioinformatics 23:522-3
Reimers, Mark; Carey, Vincent J (2006) Bioconductor: an open source framework for bioinformatics and computational biology. Methods Enzymol 411:119-34
Carey, Vincent J; Gentry, Jeff; Whalen, Elizabeth et al. (2005) Network structures and algorithms in Bioconductor. Bioinformatics 21:135-6
Durinck, Steffen; Allemeersch, Joke; Carey, Vincent J et al. (2004) Importing MAGE-ML format microarray data into BioConductor. Bioinformatics 20:3641-2
Balasubramanian, Raji; LaFramboise, Thomas; Scholtens, Denise et al. (2004) A graph-theoretic approach to testing associations between disparate sources of functional genomics data. Bioinformatics 20:3353-62