No two voices are exactly alike, and speech sounds can vary dramatically when produced by different individuals. A substantial component of this variability stems from anatomical differences between speakers that reflect their age and sex. Although these factors complicate the relationship between acoustic cues and phonetic properties of speech, they also provide information from which the listener can determine the speaker's age, sex and size, collectively referred to as indexical properties.
The aim of the proposed research is to investigate the relationship between indexical and phonetic properties in children's speech through four linked projects. Project 1 involves the construction and acoustic analysis of a vowel database from children ranging in age from 5 through 18 years. The database will provide the materials for experiments investigating the perceptual consequences of age-related changes in speech. In Project 2, natural and modified versions of the recordings will be used to examine the cues that distinguish male from female voices at different ages. Project 3 will investigate the perception of speaker age in children's voices and evaluate the effectiveness of vocal age conversion using synthesis techniques based on models of vocal tract scaling. Project 4 will investigate the link between vowel identification and indexical properties by requiring listeners to provide vowel identification responses together with judgments of the perceived sex and age of the speaker. Pattern recognition models will be implemented using acoustic measurements from the database to characterize the statistical relationships between the acoustic properties of children's speech and speaker age and sex, and to predict listeners' responses in the perceptual experiments.
This research will provide valuable information on speech development and the processes by which listeners extract linguistic and indexical information from children's speech. The findings could inform automatic speech recognition systems applied to children's speech, reveal effective strategies for synthesizing children's voices, and serve as normative data in clinical studies of disordered speech.