Both perceptual and acoustic studies of children's speech independently suggest that phonological contrasts are continuously refined during acquisition. This paper considers two traditional acoustic features for the 's'-vs.-'sh' contrast (centroid and peak frequencies) and a novel feature learned from data, evaluating these features relative to perceptual ratings of children's productions. Productions of sibilant fricatives were elicited from 16 adults and 69 preschool children. A second group of adults rated the children's productions on a visual analog scale (VAS). Each production was rated by multiple listeners; the mean VAS score for each production was used as its perceptual goodness rating. For each production from the repetition task, a psychoacoustic spectrum was estimated by passing it through a filter bank that modeled the auditory periphery. From these spectra, centroid and peak frequencies were computed; these are two traditional features for a sibilant fricative's place of articulation. A novel acoustic measure was derived by inputting the spectra to a graph-based dimensionality-reduction algorithm. Simple regression analyses indicated that a greater amount of variance in the VAS scores was explained by the novel feature (adjusted R² = 0.569) than by either centroid (adjusted R² = 0.468) or peak frequency (adjusted R² = 0.254).
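The two traditional features and the adjusted R² statistic named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a spectrum is given as parallel lists of channel frequencies and non-negative magnitudes, it does not model the auditory filter bank, and the function names `spectral_centroid`, `peak_frequency`, and `adjusted_r2` are hypothetical.

```python
def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency of a spectrum (assumed definition)."""
    total = sum(mags)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs, mags)) / total


def peak_frequency(freqs, mags):
    """Frequency of the channel with the largest magnitude (first one on ties)."""
    best = max(range(len(mags)), key=mags.__getitem__)
    return freqs[best]


def adjusted_r2(x, y):
    """Adjusted R^2 of a least-squares regression of y on a single predictor x."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # One predictor, so the residual degrees of freedom are n - 2.
    return 1 - (1 - r2) * (n - 1) / (n - 2)
```

In this sketch, each acoustic feature would be passed as `x` and the mean VAS scores as `y`, giving one adjusted R² value per feature for comparison.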
|Original language||English (US)|
|Number of pages||5|
|Journal||Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH|
|State||Published - 2017|
|Event||18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden|
Duration: Aug 20 2017 → Aug 24 2017
Bibliographical note
Funding Information:
Work described in this paper was supported by NIH grant DC02932 to Edwards, Beckman, and Munson. We thank the Learning to Talk teams at UW-Madison and UMN, especially Hannele Nicholson, Bianca Schroeder, Rebecca Hatch, and Clare Kramer, for their work in recruiting and testing subjects in the production task, in annotating the recordings, and in recruiting and running subjects in the listening task.
- Laplacian Eigenmaps
- Phonological Acquisition
- Sibilant Fricatives