Hypothesis generation is the process by which people think of explanations (e.g., medical diagnoses) to account for patterns of data (e.g., symptoms) observed in the environment. Hypothesis generation is an important component of many professional domains, ranging from generating possible diagnoses in medicine, to generating causes of going concern in auditing, to interpreting satellite imagery in intelligence. It is also quite common in non-professional contexts (e.g., explaining people's behavior in social contexts). Although scientists are beginning to understand how hypothesis generation, evaluation, and testing processes operate in discrete time frames, little is known about how these processes unfold when carried out across time. For instance, a physician does not receive the full pattern of data to be explained upfront. Rather, the full pattern of data unfolds over time (e.g., as test results are received). This research develops innovative empirical methodologies and a novel computational model to better understand the temporal dynamics of hypothesis generation. The resulting model will transform a current model of hypothesis generation by incorporating mechanisms from a state-of-the-art model of dynamic human memory processes. The resulting hybrid model is comprehensive and accounts for a wide array of phenomena underlying hypothesis generation and a multitude of processes reliant upon hypothesis generation, such as information search and hypothesis evaluation.

In service of the theoretical goals of the proposed research, the research team advances a novel empirical methodology that allows one to measure what people are thinking about, at any point in time, by assessing eye movements across carefully constructed visual search arrays. The main advantage of this technique is that it will be less invasive and more sensitive than existing procedures. This methodology is not tied to any one domain or type of task, so the methodology has the potential to support work in a host of domains across not only decision making, but cognitive science more generally.

The project has implications for the design of decision support systems. The work reveals situational hazards resulting in biased or impoverished hypothesis generation and fosters new understandings of the underlying memory dynamics contributing to real-world hypothesis generation.

Project Report

NSF support provided an opportunity to develop a novel extension of the HyGene computational model to better understand the temporal dynamics of data maintenance and hypothesis generation. This extension to the HyGene model of hypothesis generation was accomplished by incorporating mechanisms from a state-of-the-art model of dynamic human working memory processes. Moreover, the study of dynamic hypothesis generation also required an innovative empirical methodology for understanding how the dynamics of data acquisition influence hypothesis generation.

Time-Based Processes Influence Hypothesis Generation

In hypothesis-generation tasks, data are often acquired serially, one after another. As a result, each datum is experienced in a position relative to the rest of the data. Although we should, in principle, be unaffected by data order, intuition suggests that the order in which we encounter data almost certainly affects the hypotheses we generate. Both primacy bias (early data have a larger influence than later data) and recency bias (later data have a larger influence than early data) have been demonstrated in decision making (Hogarth & Einhorn, 1992). Our recent work has demonstrated order effects in people's hypothesis generation. Lange, Thomas, and Davelaar (2012b) presented participants with a sequence of four pieces of data, one informative and three uninformative, and manipulated the position of the informative piece so that it appeared in each of the four possible serial positions. This allowed the influence of the data at each serial position to be measured. Later data contributed more to participants' choice of diagnosis than early data, demonstrating a recency effect in hypothesis generation. As can be seen in Figure 1, the HyGene model captures the recency trend evidenced in the data.

[Figure 1]

The speed of data acquisition also influences diagnosis.
Lange, Thomas, Buttaccio, Illingworth, and Davelaar (2013) presented a sequence of five symptoms to participants and asked them to select the more likely of two disease hypotheses. The sequence of symptoms was such that the first two symptoms suggested one hypothesis (Disease A) and the last two symptoms suggested the other hypothesis (Disease B). Under the slow rate of symptom presentation, a recency bias obtained in diagnosis, whereas under the fast rate of presentation, the recency bias attenuated and a primacy bias emerged. Recent findings have suggested that a primacy bias is increasingly likely to obtain in hypothesis generation (i.e., diagnosis) as the complexity of the task increases (Lange, Thomas, & Davelaar, 2012a, 2012b) or as a function of increased working memory capacity (Lange, Davelaar, & Thomas, 2013). In sum, the same datum can have different impacts on which hypotheses are generated depending on where it is presented in the sequence.

Measuring Dynamic Hypothesis Generation: Memory Activation Capture (MAC) Methodology

In order to investigate dynamic hypothesis generation, the NSF supported the development of a novel empirical methodology that allows one to measure what people are thinking about by assessing eye movements across carefully constructed visual search arrays. The procedure is based on the idea that eye movements are biased by the content of working memory (Soto, Heinke, Humphreys, & Blanco, 2005; Soto & Humphreys, 2007). The MAC procedure capitalizes on this bias to measure working-memory content deployable in complex cognitive tasks. Specifically, by presenting brief visual arrays (≈500 ms) containing task-relevant information at various points in a task, differences in oculomotor guidance towards the items contained in such arrays can be taken as evidence regarding the active content of working memory at the time of the array presentations.
In this way our methodology can be thought of as an effort to capture snapshots of working memory (or what people are thinking about) across time (Lange, Thomas, Buttaccio, & Davelaar, 2013). In several experiments we have investigated the MAC procedure and have established that it is effective (Lange, Buttaccio, Davelaar, & Thomas, 2014; Lange, Thomas, Buttaccio, & Davelaar, 2012). For instance, the MAC procedure has been used to demonstrate that greater proportions of initial fixations land on the more likely hypothesis relative to other items in the visual arrays. Moreover, by manipulating the timing of the presentation of the visual arrays, differences in fixations have allowed us to elucidate the dynamic memory activations underlying hypothesis generation and data maintenance (Lange et al., 2014).

Conclusions

Hypothesis generation serves as the critical bridge between people's task environment and the decision processes enabling complex and intelligent behaviors. Although people generate good hypotheses to explain patterns of data, their hypothesis generation is impoverished as a result of underlying memory constraints, which leads to systematic biases in beliefs and information search. We believe that the HyGene model may serve as a support tool to aid the diagnostic decision making of professionals by inoculating them against bias and to improve the robustness of existing applications of artificially intelligent classification systems (Thomas et al., 2010).
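
The first-fixation measure described in the MAC section of this report can be illustrated with a small sketch. It only shows the general idea of tallying, across trials, which array item received the initial fixation. The function name, the item labels, and the trial data are all hypothetical and are not taken from the reported experiments, whose actual analyses may differ.

```python
from collections import Counter


def first_fixation_proportions(first_fixations):
    """Return the proportion of trials on which each item type was fixated first.

    `first_fixations` is a list with one label per trial, naming the array
    item that received the initial fixation (hypothetical coding scheme).
    """
    if not first_fixations:
        raise ValueError("need at least one trial")
    counts = Counter(first_fixations)
    total = len(first_fixations)
    return {item: n / total for item, n in counts.items()}


# Hypothetical data (assumption, for illustration only): the label of the
# first-fixated item on each of 10 array presentations.
trials = [
    "likely", "likely", "unlikely", "likely", "distractor",
    "likely", "unlikely", "likely", "distractor", "likely",
]

props = first_fixation_proportions(trials)
for item, p in sorted(props.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {p:.2f}")
```

In this made-up example, a larger share of first fixations on the "likely" item than on the other items would be read, under the logic of the MAC procedure, as suggesting that the more likely hypothesis was more active in working memory when the array appeared.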

National Science Foundation (NSF)
Division of Social and Economic Sciences (SES)
Standard Grant (Standard)
Program Officer
Robert O'Connor
University of Oklahoma
United States