Most biomedical text-mining systems target only text and do not provide intelligent access to other important data such as figures. More than any other form of documentation, figures usually represent the "evidence" of discovery in the biomedical literature. Full-text biomedical articles nearly always include images, which are crucial content for biomedical knowledge discovery. Biomedical scientists need access to images to validate research facts and to formulate or test novel research hypotheses. Evaluation has shown that textual statements reported in the literature are frequently noisy (i.e., contain "false facts"). Capturing the images that serve as experimental "evidence" for a textual "fact" will benefit biomedical information systems, databases, and biomedical scientists. We are developing a biomedical literature figure search engine, BioFigureSearch. To build it, we are developing innovative algorithms and models in natural language processing, image processing, machine learning, and user interfacing. The deliverables will be novel biomedical natural language figure processing (bNLfP) algorithms; iBioFigureSearch, which will allow biomedical scientists to access figure data effectively; and open-source tools that will enhance biomedical information retrieval, summarization, and question answering. The bNLfP algorithms we develop can be applied to or integrated into other biomedical text-mining systems.
This project proposes innovative algorithms and models in natural language processing, image processing, machine learning, and user interfacing to return figures in response to biomedical queries. It is anticipated that the algorithms, models, and tools developed will significantly enhance biomedical scientists' access to figures reported in the literature, and thereby expedite biomedical knowledge discovery.
Polepalli Ramesh, Balaji; Sethi, Ricky J; Yu, Hong (2015) Figure-associated text summarization and evaluation. PLoS One 10:e0115671
Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi et al. (2015) DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures. PLoS One 10:e0126200
Zhang, Qing; Yu, Hong (2014) Computational approaches for predicting biomedical research collaborations. PLoS One 9:e111795
Liu, Feifan; Yu, Hong (2014) Learning to rank figures within a biomedical article. PLoS One 9:e61567
Li, Yanpeng; Yu, Hong (2014) A robust data-driven approach for gene ontology annotation. Database (Oxford) 2014:bau113
Polepalli Ramesh, Balaji; Houston, Thomas; Brandt, Cynthia et al. (2013) Improving patients' electronic health record comprehension with NoteAid. Stud Health Technol Inform 192:714-8
Liu, Feifan; Moosavinasab, Soheil; Agarwal, Shashank et al. (2013) Automatically identifying health- and clinical-related content in wikipedia. Stud Health Technol Inform 192:637-41
Zhang, Qing; Yu, Hong (2013) CiteGraph: a citation network system for MEDLINE articles and analysis. Stud Health Technol Inform 192:832-6
Ramesh, Balaji Polepalli; Prasad, Rashmi; Miller, Tim et al. (2012) Automatic discourse connective detection in biomedical text. J Am Med Inform Assoc 19:800-8
Liu, Feifan; Moosavinasab, Soheil; Houston, Thomas K et al. (2012) MedTxting: learning based and knowledge rich SMS-style medical text contraction. AMIA Annu Symp Proc 2012:558-67