The study of phonological development has important implications for the diagnosis and treatment of language disorders, models of the biological bases of language production, the teaching of second languages, and the general advancement of linguistic theory. Recent advances in computational power make it possible for researchers to link high-quality digital recordings to phonological and phonetic transcriptions. Using standards such as Unicode, IPA, and XML, the CHILDES database project now provides universal access to large corpora of transcripts linked to audio for students of both first and second language acquisition, along with a wide array of tools for lexical, syntactic, and discourse analysis. However, the CHILDES Project has not yet built effective tools for phonological and phonetic analysis. We will close this gap by developing a new Java-based program called Phon that interfaces with the CHILDES transcription format. Phon will provide: (1) easy user-controlled utterance boundary marking, (2) an input method for Unicode IPA transcription of child forms, (3) automatic alignment of segments in child forms to waveform regions, (4) automatic insertion of the IPA form for adult target words, (5) automatic alignment of child forms to the adult targets at both the segmental and prosodic levels, (6) tools for querying the database, and (7) tools for composing output reports. Phon will be configured to run either locally or over the web as a Java Web Start application. The construction of the new database will be supported by a group of 26 researchers who have agreed to contribute previously collected and transcribed corpora from children learning 17 different languages. Subjects include bilingual children, normally developing monolinguals, and children with language disorders.
The data will be structured to facilitate testing of models regarding babbling universals, variant paths in segmental and prosodic development, markedness effects, prosodic context effects, segmentation patterns, statistical learning, frequency effects, interlanguage transfer, diagnosis of disability, stuttering patterns, disfluency patterns, and the effects of morphology and syntax. Benchmarks will be established to emphasize direct competitive testing of hypotheses drawn from alternative theoretical and methodological positions.

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Project (R01)
Study Section
Language and Communication Study Section (LCOM)
Program Officer
McCardle, Peggy D
Carnegie Mellon University
Schools of Arts and Sciences
United States
Byun, Tara McAllister; Rose, Yvan (2016) Analyzing Clinical Phonological Data Using Phon. Semin Speech Lang 37:85-105
Rose, Yvan; Stoel-Gammon, Carol (2015) Using PhonBank and Phon in studies of phonological development and disorders. Clin Linguist Phon 29:686-700
MacWhinney, Brian (2014) Challenges facing COS development for aphasia. Aphasiology 28:1393-1395
MacWhinney, Brian (2014) What we have learned. J Child Lang 41 Suppl 1:124-31
Arbib, Michael A; Bonaiuto, James J; Bornkessel-Schlesewsky, Ina et al. (2014) Action and language mechanisms in the brain: data, models and neuroinformatics. Neuroinformatics 12:209-25
Miyata, Susanne; MacWhinney, Brian; Otomo, Kiyoshi et al. (2013) Developmental Sentence Scoring for Japanese (DSSJ). First Lang 33:200-216
Andreu, Llorenç; Sanz-Torrent, Mònica; Legaz, Lucia Buil et al. (2012) Effect of verb argument structure on picture naming in children with and without specific language impairment (SLI). Int J Lang Commun Disord 47:637-53
Andreu, Llorenç; Sanz-Torrent, Monica; Guàrdia Olmos, Joan et al. (2011) Narrative comprehension and production in children with SLI: an eye movement study. Clin Linguist Phon 25:767-83
Sagae, Kenji; Davis, Eric; Lavie, Alon et al. (2010) Morphosyntactic annotation of CHILDES transcripts. J Child Lang 37:705-29
MacWhinney, Brian; Wagner, Johannes (2010) Transcribing, searching and data sharing: The CLAN software and the TalkBank data repository. Gesprächsforschung 11:154-173
