Languages have multiple levels of statistical structure. Certain sounds tend to co-occur in forming words (like the "st" sound sequence in "stop" and "list"), certain words tend to appear together in forming phrases, and so on. It is well established that infants and children learn these statistics (with apparent ease) in the course of learning their languages, but how these statistics contribute to language learning is an ongoing area of research. The answer to this question is key to understanding the human capacity for language.

With the support of the National Science Foundation, Dr. Thiessen is investigating how infants discover the statistical information available in their linguistic environments. His hypothesis is that the statistics learned at one level of language structure (sounds, for example) are used to facilitate the learning of language structures at other levels (words and phrases, for example). To investigate this hypothesis, Dr. Thiessen is conducting a number of laboratory experiments that test the statistical learning abilities of infants and toddlers. Student research training will include a new class-based initiative in which undergraduates work in the community to provide education on language development to parents and daycare centers. The knowledge gained may suggest more effective ways of teaching second languages to adults, and better remediation techniques for children with developmental disorders that impede language learning.

Project Report

This research assessed the role of statistical information in infants' language development, and the nature of the learning mechanisms that make it possible for infants to take advantage of the statistical structure in linguistic input. Prior research had demonstrated that infants are sensitive to a variety of statistical cues, such as the degree to which sounds predict each other. For example, sounds within a word are more predictable than sounds across word boundaries: when a baby hears "pre," it is quite likely that "tty" is coming next, but after a baby hears "pretty" there are many sounds that can potentially occur.

This project examined three specific questions about the relation between statistical learning and language development. First, how does statistical information contribute to infants' understanding of word meaning? Second, to what extent are differences in learning from linguistic and non-linguistic stimuli explained by differences in underlying learning mechanisms, as opposed to differences in the amount of prior experience that infants have with different kinds of stimuli? Third, because spoken language is an inherently cross-modal kind of input (sounds refer to objects in the world), how is learning influenced by the presentation of information in single vs. multiple modalities?

To explore the first question, we used a task in which infants learned names for novel objects. Replicating prior research, we found that 15-month-olds typically ignore small differences in sound, even when those differences are perceptible and meaningful to adult speakers. For example, infants treat "daw" and "taw" as though both can refer to the same novel object. However, we found that training could alter this performance. When infants were exposed to the phonemes /d/ and /t/ in distinct contexts (like "doggy" vs. "teddy"), they were more likely to use the /d/-/t/ distinction. This result suggests that the contexts in which phonemes are distributed, and infants' familiarity with those phonemes, influence how likely infants are to treat them as meaningfully different.

To explore the second question, we used a task in which infants learn that a set of sequences follows a pattern such as ABA (e.g., syllable sequences like "ga ti ga" or "li mo li") or an ABB pattern. Prior experiments indicated that 8-month-old infants learn these patterns more easily for speech than for non-linguistic stimuli like tones or shapes. We replicated this result, and then showed that infants are able to learn from non-linguistic stimuli if those stimuli are made more informative (that is, if there is more information differentiating A elements from B elements). This result suggests that one reason infants are more successful in learning from linguistic stimuli is that they may have more experience attending to the relevant dimensions in speech.

To explore the third question, we exposed learners to a stream of speech with no pauses between words. The only cue to word segmentation in the auditory input was the predictability of the sounds within a word (high) compared to the predictability across word boundaries (low). In addition, each word, when it occurred in the speech stream, was paired with a picture of a unique object. We found that adults were more successful in segmenting the speech into words when the words were paired with pictures than when the words were presented alone, even though the cross-modal input is necessarily more complex. Infants over a year old showed the same pattern. By contrast, 8-month-old infants learned equally well whether or not words were presented with corresponding images. These results indicate that the ability to take advantage of cross-modal information develops with age, possibly as infants learn through experience that words typically refer to objects in the world.
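To make the segmentation statistic concrete, the short sketch below computes syllable-to-syllable transitional probabilities over a synthetic stream. The miniature three-word vocabulary and the Python code are illustrative assumptions, not the project's actual stimuli or analysis software; they simply show how within-word transitions end up more predictable than transitions across word boundaries.

import random
from collections import defaultdict

def transitional_probabilities(stream):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = defaultdict(int)
    first_counts = defaultdict(int)
    for a, b in zip(stream, stream[1:]):
        pair_counts[(a, b)] += 1
        first_counts[a] += 1
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Build a pause-free stream from three made-up two-syllable "words",
# mimicking continuous speech with no acoustic word boundaries.
random.seed(0)
words = [["pre", "tty"], ["ba", "by"], ["do", "ggy"]]
stream = [syl for _ in range(200) for syl in random.choice(words)]

tps = transitional_probabilities(stream)
print(tps[("pre", "tty")])  # within-word transition: 1.0
print(tps[("tty", "ba")])   # across a word boundary: roughly 1/3

Running this prints 1.0 for the within-word transition and a value near one third for the boundary transition, which is exactly the contrast that learners are hypothesized to exploit when segmenting words from fluent speech.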

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Application #: 0642415
Program Officer: Lawrence Robert Gottlob
Budget Start: 2007-08-15
Budget End: 2012-07-31
Fiscal Year: 2006
Total Cost: $450,000
Name: Carnegie-Mellon University
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213