High-throughput experimental methods have accelerated biomedical research dramatically. Approaches such as microarray analysis, genome-wide association studies (GWAS), deep sequencing, and brain imaging reduce bottlenecks in data generation and collection. Understanding the biological significance of high-throughput data, however, remains a major challenge [1]. As Bota and Swanson point out, it is now "far beyond the grasp of individual investigators, no matter how brilliant, to remember, evaluate, and synthesize the neuroscience literature, even in restricted domains like network structure, physiology, or chemistry" [2]. We argue that a key part of the problem is insufficient support for drawing high-dimensional functional relationships from high-throughput experimental data in the context of existing literature and data. Prevailing search solutions, such as PubMed and Google Scholar, are designed mainly for retrieving the most relevant information efficiently, not for exploratory hypothesis development. They lack several key functionalities, which our proposed system will provide, that are required for understanding the biology of high-throughput data through literature and database exploration aimed at hypothesis development:

Overview of Medline search results in familiar biological contexts to facilitate exploration: Presenting search results in graphic overviews that reflect the inherent biological relationships among the retrieved records will be more effective than a linear list of potentially relevant records alone. Such overviews, ideally drawn from multiple biological contexts, should also support efficient interactive exploration of attribute data and pattern associations, so that non-obvious relationships can be derived from multiple perspectives.

Query support for different algorithms, biological entities, and data sources: No single retrieval algorithm fits all situations. Biological entities such as gene IDs and genomic locations need to be supported in Medline queries. The Medline database needs to be supplemented by external data sources such as ontologies, pathways, and various databases containing curated information derived from experimental data.

Open architecture for third-party plug-ins and cross-application function integration: Support for third-party data and function plug-ins is needed to enhance the functionality and adaptability of a solution. An open architecture will also enable the use of intermediate data and/or functions from other solutions.

Incorporating these functions, we propose to develop a system called PubViz that will more effectively support neurobiologists' needs for developing hypotheses on the molecular mechanisms underlying major mental disorders through integrated exploration of literature and data related to high-throughput experimental results. We will also conduct systematic needs assessments and user tests to ensure that the functions we develop match users' needs. Building on our existing prototypes of component functions, PubViz will provide a query and analysis environment that exceeds other systems in helping scientists work toward formulating hypotheses. It will integrate Medline search results with data and information from external resources and situate relationships visually and interactively in multiple biological contexts that are useful and usable. Creating these combined innovations and human-computer interface (HCI) designs is non-trivial but feasible, given our pilot work and our experience in developing visual Medline exploration solutions, in data analysis and integration, and in usability and usefulness studies. Additionally, focusing this project on neurobiology and mental disorders, a research domain in which we have extensive experience, will help us address critical user needs and functionalities more effectively. Moreover, the solution we develop should be adaptable to other biomedical research domains.

Public Health Relevance

This project aims to create a cross-domain literature and data exploration system that will match and effectively support neurobiologists' needs for exploring the mechanisms of major depression, bipolar disorder, and schizophrenia based on high-throughput experimental data and the Medline literature. We expect the proposed system to promote the understanding of the molecular mechanisms underlying mental disorders and the development of new therapeutic strategies. The system is designed so that it can be extended to other areas of biomedical research.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
University of Michigan Ann Arbor
Schools of Medicine
Ann Arbor
United States