The Human Genome Project, officially begun over a decade ago, has continued to yield superb technologies and vast datasets applicable to a wide diversity of scientific interests. While the technology to acquire vast amounts of data is now well established and continues to expand, the ability to deal with such data, from acquisition through storage and analysis, depends fundamentally on a solid informatics infrastructure. Indeed, most of the major gains in productivity in this field are to be realized on the informatics front: in automating data acquisition, defining and sorting data in databases for quality control and analysis, facilitating access to data, and supporting the large variety of analyses to which such data will be subjected. Over the past 2 years we have been involved in establishing an informatics infrastructure in a small genetics laboratory focused on resequencing human immune response genes. From this experience we have built a genetics management system (GEMS) that is applicable in a general way to the management of data acquired from automatic DNA sequencers. This resource includes support for three types of data acquisition: 1) PCR resequencing, which can be used for SNP discovery and genotyping; 2) shotgun sequencing projects, applicable to primary data acquisition as well as resequencing of target regions from cohorts in disease studies; and 3) genotyping methods that use capillary-based data acquisition (e.g., microsatellite, SSP-SNP, HLA typing).
Building on this work, we propose to accomplish the following specific aims: 1) To develop a software package that improves the throughput and productivity of small to medium-sized sequence-based genotyping and sequencing labs. A prototype of the final package has already gone through two complete code revisions and is being used to design DNA sequencing trials, organize the sequencing process, gather data, and assess the quality of the sequencing output. 2) To work in parallel with existing collaborations aimed at enhancing the ability of diverse genetics data producers to share data. We have identified six laboratories producing genetic sequence data that have agreed to collaborate with us in testing the existing software. From this experience, we will learn what new functionality is needed and incorporate it into the software, with the aim of producing a package that can effectively support the activities of all six laboratories.

Agency
National Institutes of Health (NIH)
Institute
National Center for Research Resources (NCRR)
Type
Small Business Technology Transfer (STTR) Grants - Phase I (R41)
Project #
1R41RR018669-01A1
Application #
6789004
Study Section
Special Emphasis Panel (ZRG1-SSS-Y (10))
Program Officer
Filart, Rosemarie
Project Start
2004-07-01
Project End
2005-12-31
Budget Start
2004-07-01
Budget End
2005-12-31
Support Year
1
Fiscal Year
2004
Total Cost
$100,000
Indirect Cost
Name
Immunogenomics, Inc.
Department
Type
DUNS #
City
Seattle
State
WA
Country
United States
Zip Code
98175