One of the most controversial claims in linguistic theory is that language learning poses an induction problem: the data that children are exposed to is inadequate to yield the knowledge of language they eventually attain. One proposed solution is that children are born with some knowledge specifically about language, known as Universal Grammar (UG), that allows them to hypothesize the correct linguistic knowledge from the available data. Recent debates in the language learning literature have questioned both the existence of induction problems and the possibility of ever disproving the existence of UG. This project specifies a concrete methodology for addressing both questions, that is, for identifying induction problems and for testing whether UG is needed to solve them. It draws on recent advances in experimental and computational technologies and incorporates aspects of theoretical syntax, psycholinguistic experimentation, and Bayesian modeling. The first stage of the project quantifies the input available to children for various linguistic phenomena, using transcripts of child-directed speech to assess exactly how often children encounter certain constructions. The second stage quantifies the knowledge of language children attain, using experimental syntax techniques to gauge adult linguistic knowledge. A potential induction problem is marked by a construction that appears relatively infrequently in child-directed speech yet is judged relatively acceptable by adults, since children must then decide that the construction is part of their language without encountering it very often (or sometimes at all). The third stage of the project uses psychologically motivated Bayesian models of language learning to evaluate what is required to solve these potential induction problems and, importantly, whether language-specific knowledge is necessary to do so.
The project begins the search for induction problems with a set of phenomena that UG advocates unequivocally agree must be part of UG: constraints on long-distance dependencies, also known as island constraints.
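
The comparison made in the first two stages can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the construction names, the counts, the ratings, and both thresholds are hypothetical and are not data or settings from the project. It flags a construction as a potential induction problem when its rate in child-directed speech is low but its mean adult acceptability rating is high.

```python
from dataclasses import dataclass


@dataclass
class Construction:
    name: str
    count: int            # occurrences in child-directed speech (hypothetical)
    acceptability: float  # mean adult rating, z-scored (hypothetical)


def potential_induction_problems(items, total_utterances,
                                 max_rate=1e-4, min_acceptability=0.0):
    """Return names of constructions that are rare in the input
    but still judged acceptable by adults.

    Both thresholds are illustrative assumptions, not values from the project.
    """
    flagged = []
    for item in items:
        rate = item.count / total_utterances
        if rate <= max_rate and item.acceptability >= min_acceptability:
            flagged.append(item.name)
    return flagged


# Made-up example data: one frequent construction, two rare ones.
items = [
    Construction("frequent_and_acceptable", 5000, 1.2),
    Construction("rare_but_acceptable", 3, 0.8),
    Construction("rare_and_unacceptable", 0, -1.1),
]

flagged = potential_induction_problems(items, total_utterances=100_000)
print(flagged)
```

Only the rare but acceptable construction is flagged here. A rare construction that adults reject poses no such problem, since children do not need to conclude that it belongs to their language.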

This project provides a methodology for addressing one of the central debates in language and learning and demonstrates how to apply this methodology to complex phenomena central to linguistic theory, such as island constraints. By connecting theoretical and experimental work in linguistics with computational modeling, this project will yield results not achievable from any of these subfields individually and will influence the methodological norms in both theoretical syntax and computational models of language acquisition. As this research refines our understanding of language learning and of which knowledge must be innate, the project aims to inspire productive collaboration across the traditional divides of theory, experimentation, and computation, both within UCI and across the field as a whole. This, in turn, will help us understand how children overcome the difficulties inherent in language learning. An additional feature of the project is the creation of a database of conversational speech in which structures of theoretical interest are explicitly marked, which should be a valuable research tool for both theoretical linguists and computational modelers for years to come. The availability of this tool will stimulate research on phenomena central to linguistic theory that was previously difficult or impossible because of a lack of available data. Moreover, this project integrates research and education by providing hands-on research experience in linguistics to undergraduate students each year, thereby supporting the recruitment of future linguists from the economically and ethnically diverse population of students at a large public university.

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Type: Standard Grant (Standard)
Application #: 0843896
Program Officer: William J. Badecker
Project Start:
Project End:
Budget Start: 2009-03-01
Budget End: 2013-02-28
Support Year:
Fiscal Year: 2008
Total Cost: $176,713
Indirect Cost:

Name: University of California Irvine
Department:
Type:
DUNS #:
City: Irvine
State: CA
Country: United States
Zip Code: 92697