While standards for reporting scientific methods are critical to producing reproducible science, meeting such standards is difficult. Checklists and instructions are tough to follow, often resulting in low and inconsistent compliance. Scientific journals and societies, as well as the National Institutes of Health, are now actively proposing general guidelines to address reproducibility issues, particularly in the reporting of methods (e.g., www.cell.com/star-methods), but the trickier part will be to train the biomedical community to use these standards to effectively improve how scientific methods are communicated.
To support new standards in methods reporting, specifically the RRID standard for Rigor and Transparency of Key Biological Resources, we propose to build Sci-Score, a text-mining-based tool suite to help authors meet the standard. Sci-Score will provide an automated check on compliance with the RRID standard, already implemented by over 100 journals including Cell, Journal of Neuroscience, and eLife. The innovation behind Sci-Score is the provision of a score, obtainable by individual investigators, that reflects a numerical validation of the quality of their methods reporting. We posit that the score will serve as a tool that investigators and journals can use to compete with themselves and each other, or at the very least allow them to see how close they are to the average in meeting quality requirements.
Recently, our group developed a text mining algorithm that has been successfully used to detect software tools and databases from the SciCrunch Registry in published papers. Digital tools are one of four resource types that the RRID standard identifies. We propose to extend this approach to the other three types of entities: antibodies, cell lines, and model organisms. Resource identification, along with other quality metrics, will be used to train an algorithm to score the overall quality of the methods document.
If successful, the tool could be used by editors, reviewers, and investigators to increase the number of RRIDs, and therefore the quality of descriptors of key biological resources, in published papers. This SBIR project will build a set of algorithms similar to the resource-finding pipeline and develop it into an industrially robust and reconfigurable software system. Our Phase I specific aims are to 1) create gold-standard data sets for each resource type and train a set of algorithms for each resource type; 2) design and evaluate the scoring system; and 3) design and evaluate a report-generating system based on the previous aims. In Phase II, we will develop a scalable backend infrastructure to serve the needs of scientific publishers and the research community.
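As a rough illustration only, and not the project's actual algorithm, the simplest part of such a compliance check could look like the sketch below. It finds RRIDs that are already written out explicitly in a methods text and turns a resource count into a toy score. The function names, the prefix-to-type mapping, and the 0-10 scoring formula are all assumptions made for this example; the identifiers in the sample text are used purely as format examples; and the pattern covers only the underscore-prefixed identifier form. Detecting resources that lack an RRID, which is the text mining problem described above, is not attempted here.

```python
import re

# Matches explicitly written identifiers such as "RRID:AB_2298772".
# Assumption: only the underscore-prefixed form is handled in this sketch.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9:_-]+)")

# Assumed mapping from identifier prefix to resource type (illustrative only).
PREFIX_TO_TYPE = {
    "AB": "antibody",
    "SCR": "software tool or database",
    "CVCL": "cell line",
}


def find_rrids(text):
    """Return (identifier, resource_type) pairs for every RRID cited in text."""
    results = []
    for match in RRID_PATTERN.finditer(text):
        identifier = match.group(1).rstrip(":-")
        prefix = identifier.split("_", 1)[0].upper()
        resource_type = PREFIX_TO_TYPE.get(prefix, "organism or other")
        results.append((identifier, resource_type))
    return results


def toy_score(num_resources_mentioned, num_with_rrid):
    """Hypothetical 0-10 score: share of mentioned resources that carry an RRID."""
    if num_resources_mentioned == 0:
        return 0.0
    return round(10.0 * num_with_rrid / num_resources_mentioned, 1)


if __name__ == "__main__":
    sample = (
        "Sections were stained with a primary antibody (RRID:AB_2298772) "
        "and images were analyzed with image software (RRID:SCR_003070)."
    )
    for identifier, resource_type in find_rrids(sample):
        print(identifier, "->", resource_type)
```

In this sketch the number of resources mentioned would have to come from a separate detection step; the score simply rewards a higher share of resources that carry an identifier.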

Public Health Relevance

Standards for scientific methods reporting are absolutely critical to producing reproducible science, but meeting such standards is difficult. Checklists and instructions are tough to follow, often resulting in low and inconsistent compliance. To support new standards in methods reporting, specifically the RRID standard for Rigor and Transparency, we propose to build Sci-Score, a text-mining-based tool suite to help authors meet the standard. Sci-Score will provide an automated check on compliance with the RRID standard implemented by over 100 journals including Cell, Journal of Neuroscience, and eLife. Sci-Score will provide a score rating the quality of methods reporting in submitted articles, which provides feedback to authors, reviewers, and editors on how to improve compliance with RRIDs and other standards.

Agency
National Institutes of Health (NIH)
Institute
Office of The Director, National Institutes of Health (OD)
Type
Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #
1R43OD024432-01
Application #
9345707
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Watson, Harold L
Project Start
2017-08-03
Project End
2018-08-02
Budget Start
2017-08-03
Budget End
2018-08-02
Support Year
1
Fiscal Year
2017
Total Cost
Indirect Cost
Name
Scicrunch, Inc.
Department
Type
DUNS #
080111340
City
San Diego
State
CA
Country
United States
Zip Code
92122