The field of phonetics has experienced two revolutions in the last century: the advent of the sound spectrograph in the 1950s and the application of computers beginning in the 1970s. Today, advances in digital multimedia, networking and mass storage promise a third revolution: a shift from the study of small, individual datasets to the analysis of published corpora that are several orders of magnitude larger.
These new bodies of data are badly needed to enable the field of phonetics to develop and test hypotheses across languages and across the many types of individual, social and contextual variation. However, in contrast to speech technology research, speech science has so far taken relatively little advantage of this opportunity, because using these resources for phonetics research requires tools and methods that are currently incomplete, untested and inaccessible to most researchers.
This project fills that gap by integrating, adapting and improving techniques developed in speech technology research, chiefly forced alignment of digital audio with phonetic representations derived from orthographic transcripts. The research will help the field of phonetics enter a new era of research on very large speech corpora, ranging from hundreds of hours to hundreds of thousands of hours.