Mastering spoken and written discourse is vitally important in human cognitive development, and it is increasingly recognized that this mastery depends on implicit knowledge of "linguistic probabilities": the likelihoods of types of expressions occurring in ordinary language use. For example, psychologists have shown that while reading or hearing a sentence, people instantaneously anticipate the more likely of two types of continuations, even when the two are semantically equivalent. Likewise, researchers of the historical and social dimensions of language have demonstrated that speakers of different regional varieties of the same language (such as British and American English) are characterized by different models of their speech probabilities, and that these probabilities have been changing over historical time. Finally, developmental studies have revealed children's sensitivity to the statistical regularities of their linguistic environment and their mastery of the variable higher-level linguistic structures that occur in spontaneous discourse. Combining these psychological, historical/social, and developmental perspectives, the present project will investigate how implicit knowledge of linguistic probabilities develops in the individual and in historically diverging groups of speakers. The project will make use of a common theoretical framework for studying linguistic probabilities, as well as a common set of semantically equivalent types of linguistic expressions that differ in their linguistic probabilities. The latter, called "syntactic alternations", are alternative ways of expressing the same message (such as "give her a book/give a book to her" or "the woman's shadow/the shadow of the woman").
The project will enlist an international team of experts to conduct on-site field research on the same syntactic alternations in three suites of studies: (i) parallel observational studies and experiments across groups of speakers of closely related dialects of English, to investigate the varying probabilities of syntactic alternations in speaking or writing and their effect on understanding; (ii) studies of how probabilistic models of higher-level linguistic structures from the spontaneous spoken language of children and their caretakers develop over the time-course of primary language learning; and (iii) studies of how the predictors of probabilistic changes in the same higher-level linguistic structures develop in historical time.

The project has unusual intellectual scope because it combines the methods of multiple expert collaborators, methods that are seldom brought together within an integrated theoretical approach. New datasets built for this project will be made publicly available to all researchers. This work also has potential applications in reading, second-language education, and language impairment. Working as a multidisciplinary team on this central set of fundamental, linked problems in the comprehension, production, and development of spoken and written discourse will more rapidly advance the growing convergence of probabilistic approaches to language across the language sciences, including computer science, communication engineering, psychology, and linguistics.

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Type: Standard Grant (Standard)
Application #: 1025602
Program Officer: William Badecker
Project Start:
Project End:
Budget Start: 2010-09-15
Budget End: 2015-02-28
Support Year:
Fiscal Year: 2010
Total Cost: $274,997
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305