The goal of this project is to build a software toolkit that enables a biologist to create, from a collection of online articles, a database of protein subcellular localization information that can be queried, browsed, or used to support data-mining activities. We have developed a system, called SLIF, that harvests fluorescence microscope images from online papers, analyzes them using image-processing methods, and annotates them with information from the accompanying textual description. We propose to improve and extend this system to produce a robust, comprehensive toolkit for extracting information about subcellular localization from the text and images found in online journals, and for analyzing, verifying, and querying the resulting body of information.
Ahmed, Amr; Arnold, Andrew; Coelho, Luis Pedro et al. (2010) Structured Literature Image Finder: Parsing Text and Figures in Biomedical Literature. Web Semant 8:151-154
Coelho, Luis Pedro; Ahmed, Amr; Arnold, Andrew et al. (2010) Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature. Lect Notes Comput Sci 6004:23-32
Ahmed, Amr; Xing, Eric P; Cohen, William W et al. (2009) Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature. KDD 2009:39-48
Qian, Yuntao; Murphy, Robert F (2008) Improved recognition of figures containing fluorescence microscope images in online journal articles using graphical models. Bioinformatics 24:569-76
Kou, Zhenzhen; Cohen, William W; Murphy, Robert F (2007) A stacked graphical model for associating sub-images with sub-captions. Pac Symp Biocomput :257-68