In FY19, NLM made significant progress on enhancing its National Biomedical Information Services as it:

Added 1.35 million citations to the PubMed bibliographic database (now with 30 million citations to biomedical journal articles) and developed updated search technology and mobile optimization to be included in a new PubMed in late FY19. Added 600,000 articles to PubMed Central (PMC), which provides free public access to 5.5 million full-text journal articles from research supported by NIH, other federal agencies, and private and international research funders. Launched the PMC Associated Data box to show data citations, availability, and supplementary materials to facilitate discovery of openly available datasets. Completed a multi-year partnership with Wellcome Trust to make thousands of articles from historical biomedical journals freely available via PMC Text Mining Collections.

Partnered via the National Network of Libraries of Medicine (NNLM; more than 7,000 health science libraries and information centers) with public libraries to: educate consumers about precision medicine and the NIH All of Us Research Program in communities that are underrepresented in biomedical research; train health sciences librarians in research data management and data science; encourage citizen science; and improve access to high-quality, science-based medical information online. Trained via NNLM more than 100,000 people (3,000 educational activities; 5,000 resource demonstrations). Supported 335 outreach and training activities for 200,000 users.

Added 32,000 new clinical research studies and 6,100 new results summaries to ClinicalTrials.gov (total of 318,000 studies and 39,000 summaries of study results and adverse events information). Contributed to implementation of regulations (42 CFR Part 11) and NIH policy for clinical trial transparency.
Improved access to reference information on environmental health and toxicology by migrating information from TOXNET (e.g., TOXLINE, Hazardous Substances Data Bank, and Haz-Map) to PubMed and PubChem.

Migrated MedlinePlus (information on medical tests, drugs, healthy recipes, and educational videos in 60 languages) to the cloud and doubled the number of lab tests covered to 157. Via MedlinePlus Connect, linked EHRs and patient portals to consumer health information from MedlinePlus using standard codes and vocabularies.

Coordinated clinical data standards for HHS and provided tools for exchanging, using, and facilitating interoperability of machine-readable clinical health data: added 240,000 implantable devices to the AccessGUDID database (links FDA and SNOMED codes for medical devices) to support certified EHR requirements; updated the SNOMED CT to ICD-10-CM map in the Interactive Map Assisted Generation of ICD Codes (i-MAGIC) tool; deepened support of CMS clinical quality measures in the Value Set Authority Center; and augmented the NIH Common Data Element Repository with tools and content to facilitate data sharing, aggregation, collaboration, comparison, and usability.

Enhanced drug information resources: Drug Information Portal, DailyMed (drug labeling information from package inserts for 110,000 drugs, with mobile access), Pillbox (rapid identification of unknown solid-dosage medications based on physical characteristics and images), LiverTox (clinical, diagnostic, and research information and case registry on liver injury due to drugs, herbals, and dietary supplements), and LactMed (effects of 1,300 drugs, dietary supplements, and diagnostic agents on breastfeeding mothers and their nursing infants; available in NCBI Books). DailyMed and Pillbox are linked to NLM's RxNorm standard drug names.

Updated training classes for the Disaster Information Specialist Program, which enables information professionals and librarians to be health information responders in their communities.
NLM provides tools for hazardous materials and chemical, biological, radiological, and nuclear incidents (WISER, CHEMM, and REMM). Added information on fourth-generation nerve agents per a White House Office of Science and Technology Policy request and new chemical decontamination guidelines per an HHS Office of the Assistant Secretary for Preparedness and Response request. Added new WISER features to view data sources and export/share protective distance maps.

Added 380 million sequences to GenBank (the NIH genetic sequence database, which contains all publicly available DNA sequences), 42 million sequence records (a 26% increase) to RefSeq (database of reference sequences including genomic, transcript, and protein), and 80,000 human genome sequence variants to ClinVar (archive of reports of the relationships among human variations and phenotypes). Via the Sequence Data Delivery Project, moved 5 petabytes of public SRA data to two cloud vendors (NIH STRIDES initiative) and released BLAST and prokaryotic genome annotation tools that were packaged to support use in cloud or compute center environments. Related work improved: the search experience for users seeking data and information about genomes, proteins, and clinical variation; discovery of and access to enormous quantities of data from high-throughput sequencing; and submission processing and data quality assessment.

Collaborated with CDC, FDA, and USDA to use high-throughput sequencing to rapidly and accurately identify pathogens causing foodborne illnesses and combat antimicrobial resistance (AMR). Used the Pathogen Detection pipeline to process genome sequence data for 100,000 food, factory, farm, and other samples to identify sources of human illnesses caused by pathogens such as Salmonella, E. coli, and Listeria. Improved turnaround so that results reach the FDA within 24 hours, enabling the first real-time US foodborne pathogen surveillance system (used by FDA to support more than 370 actions intended to protect consumers from foodborne illness).
Released a new version of the AMRFinderPlus tool to identify AMR genes and proteins and the new National Database of Antibiotic Resistant Organisms (NDARO) to provide access to AMR data for more than 400,000 pathogens (part of the White House National Action Plan for Combating Antibiotic-Resistant Bacteria).

Combined contemporary and historical inquiry to advance biomedical knowledge via: the Circulating Now blog (6,000 subscribers and 338,000 followers); 40 NLM traveling exhibitions (260,000 visitors); and hosting seven Michael E. DeBakey Fellows in the History of Medicine, who conducted research using the world-renowned NLM historical collections.

Launched a workforce transformation initiative, the Data Science NLM Training Program, to hone NLM staff skills. More than 750 staff completed a survey to assess skills in 10 data science competencies and received individual training plans linked to NLM's Data Science Course Catalog.

Provided access to health services research information for researchers, practitioners, and policy makers via NICHSR ONESearch to: identify trends and gaps in funded research (HSRProj database of 36,000 active health services research projects from 380+ funding sources); locate 1,800 essential datasets, data collection instruments, and related literature (Health Services Research Resources database); and find current news, reports, and data (PHPartners and HSR Information Central portals).

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Scientific Computing Intramural Research (ZIH)
Huser, Vojtech; Kahn, Michael G; Brown, Jeffrey S et al. (2018) Methods for examining data quality in healthcare integrated data repositories. Pac Symp Biocomput 23:628-633
Read, Kevin B; Amos, Liz; Federer, Lisa M et al. (2018) Practicing what we preach: developing a data sharing policy for the Journal of the Medical Library Association. J Med Libr Assoc 106:155-158
Fiorini, Nicolas; Canese, Kathi; Starchenko, Grisha et al. (2018) Best Match: New relevance search for PubMed. PLoS Biol 16:e2005343
Fain, Kevin M; Rajakannan, Thiyagu; Tse, Tony et al. (2018) Results Reporting for Trials With the Same Sponsor, Drug, and Condition in ClinicalTrials.gov and Peer-Reviewed Publications. JAMA Intern Med 178:990-992
Tuohy, Patricia; Eannarino, Judith (2018) Reading graphic medicine at the National Library of Medicine. J Med Libr Assoc 106:387-390
Xue, Yuan; Xu, Tao; Zhang, Han et al. (2018) SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation. Neuroinformatics 16:383-392
Fernandez-Repollet, Emma; Locatis, Craig; De Jesus-Monge, Wilfredo E et al. (2018) Effects of summer internship and follow-up distance mentoring programs on middle and high school student perceptions and interest in health careers. BMC Med Educ 18:84
Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M et al. (2018) Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation. Nucleic Acids Res 46:D221-D228
Kilicoglu, Halil; Rosemblat, Graciela; Malicki, Mario et al. (2018) Automatic recognition of self-acknowledged limitations in clinical research literature. J Am Med Inform Assoc 25:855-861
Gartrell, Kyungsook; Brennan, Caitlin W; Wallen, Gwenyth R et al. (2018) Clinicians' perceptions of usefulness of the PubMed4Hh mobile device application for clinical decision making at the point of care: a pilot study. BMC Med Inform Decis Mak 18:27

Showing the most recent 10 out of 229 publications