Annotating the function of all genes in the human genome is a formidable task, and the biological community's collective progress to date represents only the earliest beginnings of this process. Of all the entries in the Entrez Gene database, almost 80% have five or fewer linked references in PubMed, and almost 50% have no linked references. Addressing this challenge requires not only continued effort, but also new models of functional annotation. Currently, the process of systematically annotating gene function primarily involves large-scale efforts by the model organism community and genome annotation centers. These annotation pipelines typically utilize a staff of curators to manually or semi-manually review the biomedical literature. Although well-trained and productive, the curation community is small relative to the scale of knowledge being produced, resulting in a gap between curated data and published knowledge.

This proposal describes an effort called the Gene Wiki, an initiative designed to apply the concept of "community intelligence" to gene annotation. The Gene Wiki invites and empowers the entire community to participate directly in the gene annotation process. The resulting community-reviewed gene-specific review articles serve as a complementary resource to the traditional curator-reviewed databases. The pilot project creating the Gene Wiki was quite successful, attracting a critical mass of readers, editors, and content.

This proposal extends the Gene Wiki along three specific aims. First, new content will be added to make the Gene Wiki pages more information-rich, and two mechanisms for updating content will be created to ensure that the Gene Wiki stays timely. These steps will ensure that the critical mass of users will be maintained and enlarged in the future. Second, the Gene Wiki will be integrated with WikiTrust, a system that enables readers to quickly and visually evaluate the trustworthiness of Gene Wiki content. These reliability metrics will be based on systematic analysis of the editing history of each Gene Wiki article. Third, the unstructured text in the Gene Wiki will be translated to structured knowledge for downstream data mining. This aim will be achieved by collaborating with the traditional curator community and with the biomedical ontology community. Successful completion of these three specific aims will greatly enhance the utility of the Gene Wiki to the scientific community, and also serve as an illustration of the power of community intelligence applied to biomedical research.

Public Health Relevance

The Gene Wiki is an initiative to adapt the principle of community intelligence to the goal of understanding the function of human genes. Successful completion of this work will result in a more complete and up-to-date understanding of how specific genes affect biological systems and human health.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Lyster, Peter
Scripps Research Institute
La Jolla
United States
Froimchuk, Eugene; Jang, Younghoon; Ge, Kai (2017) Histone H3 lysine 4 methyltransferase KMT2D. Gene 627:337-342
Wang, Jie; Lee, Jessica; Liem, David et al. (2017) HSPA5 Gene encoding Hsp70 chaperone BiP in the endoplasmic reticulum. Gene 618:14-23
Griffith, Malachi; Spies, Nicholas C; Krysiak, Kilannin et al. (2017) CIViC is a community knowledgebase for expert crowdsourcing the clinical interpretation of variants in cancer. Nat Genet 49:170-174
Kumar, Rakesh; Sanawar, Rahul; Li, Xiaodong et al. (2017) Structure, biochemistry, and biology of PAK kinases. Gene 605:20-31
Ghaleb, Amr M; Yang, Vincent W (2017) Krüppel-like factor 4 (KLF4): What we currently know. Gene 611:27-37
Chen, Kong; Kolls, Jay K (2017) Interluekin-17A (IL17A). Gene 614:8-14
Putman, Tim E; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian et al. (2017) WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata. Database (Oxford) 2017:
Lin, Dasheng; Alberton, Paolo; Caceres, Manuel Delgado et al. (2017) Tenomodulin is essential for prevention of adipocyte accumulation and fibrovascular scar formation during early tendon healing. Cell Death Dis 8:e3116
Hanukoglu, Israel; Hanukoglu, Aaron (2016) Epithelial sodium channel (ENaC) family: Phylogeny, structure-function, tissue distribution, and associated inherited diseases. Gene 579:95-132
Hettne, Kristina M; Thompson, Mark; van Haagen, Herman H H B M et al. (2016) The Implicitome: A Resource for Rationalizing Gene-Disease Associations. PLoS One 11:e0149621

Showing the most recent 10 out of 79 publications