Accurate and thorough curation of all data generated via PENTACON is a necessary ingredient of the glue that will bind the project together; sharing pre-publication data within the consortium would be nearly impossible without such robust curation efforts. In addition, all data that flow into PENTACON and can be released under compliance rules will be publicly available to the greater research community. We will distribute data both through the PENTACON database, which will include sophisticated search and analysis tools that depend on appropriate annotation of and connections between data (see Databases and Integration Core), and through all appropriate public repositories and/or model organism databases. Thus, the impact of this large collection of well-annotated data will go far beyond the consortium itself. While curation efforts in general are expensive endeavors, they are vital to the success of an enterprise as novel and as complex as the one we are proposing here. To limit costs, our approach is to have a relatively lean but permanent staff of curators who will be responsible for overall monitoring of the process and for several more specific tasks. We will take a practical approach and curate to the extent necessary to meet the minimum requirements of PENTACON researchers (sufficient that these investigators from diverse disciplines can communicate in a common language) and the standards set by public repositories for a particular data type, rather than to an exhaustive ideal. For example, early in Year 1, we will survey the data modelers in PENTACON to determine exactly what they need to utilize metabolomics data. By annotating the information that the modelers require plus any additional information that is standard for publication of metabolomics data, we will meet the needs of the vast majority of researchers who want to use the data.
We expect that our curation efforts on both PENTACON-generated and external data will have a significant impact on research. We will provide a rich, well-annotated source of a wide variety of data that the scientific community at large can access and use to connect findings in a meaningful and scientifically rigorous manner.