The verbatim records of democratic legislatures (or "parliaments", "representative assemblies", "councils", etc.) are an untapped source of information of unique importance for the study of both democratic societies and language over multiple time scales. For linguists, there is no other place where we have a systematic minute-by-minute record of the spoken word exchanged by a slowly changing and overlapping set of individuals over significant lengths of time. For political scientists, these records are also a unique source of deliberations by elected political representatives about the issues of the day. As such, legislative records provide exceptional opportunities for studying the dynamics of language and rhetoric, of democratic politics and representation, and of their interactions, over time scales ranging from minutes to centuries.
The record of a single legislature can, however, run to thousands of pages in a single day. It is impossible for any one person to read, much less absorb or analyze, the entire record of a legislature as quickly as it is produced. The increasing availability of these records in electronic form nonetheless opens possibilities for various forms of computerized analysis. This multidisciplinary project applies and advances recent developments in computer science, information science, and statistics (for natural language processing in particular and statistical learning from massive databases in general) to the analysis of legislative records from democracies worldwide, illuminating important questions about the dynamics of political representation and political rhetoric.
In the first stage of the project, new corpora (linguistic databases) will be developed from legislative records. Novel statistical and computational methodologies for the dynamic analysis of parliamentary language and parliamentary speakers, building on data reduction techniques including scaling, classification, and summarization, will then be developed and refined.
In the second stage of this project, these data and techniques will be applied to questions of importance to linguistics, political science, and social science more generally. In all cases, a wide range of democratic legislatures (from subnational to international), in a wide range of languages, over multiple time scales will be examined. Examples of these important questions include:
What can legislative speech tell us about the role of political parties in a democracy? When do they compete, when do they cooperate, when do they polarize, and on what issues?
What can legislative speech tell us about democratic representation? How and when are new issues incorporated into the agenda of a legislature? When and whom do representatives lead; when and whom do they follow?
What can legislative speech tell us about individual democratic representatives? Do they change their rhetorical behavior in response to citizen preferences, to career motivations, or not at all? Does gender or group identity affect rhetorical choices?
What can legislative speech teach us about language itself? How has the political content of language changed over the last two centuries? Do legislative debates actually involve exchanges of information or persuasion?
How do events and political rhetoric interact? When do events cause a shift in political rhetoric? When does talk about policy or other political change translate into actual change? Is that change predictable?
The broader impacts of this multidisciplinary international project include enhanced scientific infrastructure in the form of new software and data for the study of politics and language, enhanced public infrastructure for monitoring what are now impossibly large records of democratic institutions, and new statistical and computational techniques for analyzing large-scale textual databases in general applications. This research will ultimately lead to an increased ability to understand and forecast political and policy changes around the world as well as a greater understanding of how language affects politics, how politics affects language, and how the interaction between them affects democracy.