Abstract

This article is concerned with the question of how listeners recognize coarticulated phonemes. The problem is approached from a pattern classification perspective. First, the potential acoustical effects of coarticulation are defined in terms of the patterns that form the input to a classifier. Next, a categorization model called HICAT is introduced that incorporates hierarchical dependencies to optimally deal with this input. The model allows the position, orientation, and steepness of one phoneme boundary to depend on the perceived value of a neighboring phoneme. It is argued that, if listeners do behave like statistical pattern recognizers, they may use the categorization strategies incorporated in the model. The HICAT model is compared with existing categorization models, among which are the fuzzy-logical model of perception and Nearey's diphone-biased secondary-cue model. Finally, a method is presented by which categorization strategies that are likely to be used by listeners can be predicted from distributions of acoustical cues as they occur in natural speech.
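
As a purely illustrative sketch of the hierarchical dependency described above, and not the HICAT model's actual formulation, the following Python fragment lets the boundary for a second phoneme depend on the category perceived for the first. Logistic boundaries, the two-cue input, the category labels A/B and C/D, and all parameter values are assumptions introduced here for illustration. The weight vector of each boundary sets its orientation and steepness, and the bias sets its position.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical boundary for the first phoneme (A versus B).
FIRST = {"weights": (1.5, 0.0), "bias": -0.75}

# Hypothetical boundaries for the second phoneme (C versus D),
# one per perceived value of the first phoneme. Different weights
# and biases give each context its own position, orientation,
# and steepness.
SECOND_GIVEN_FIRST = {
    "A": {"weights": (0.2, 2.0), "bias": -1.0},
    "B": {"weights": (-0.4, 0.8), "bias": 0.6},
}


def linear(params, cues):
    """Signed distance-like score of the cues relative to a boundary."""
    return sum(w * c for w, c in zip(params["weights"], cues)) + params["bias"]


def joint_probabilities(cues):
    """Return P(first, second | cues) for all four category pairs."""
    p_a = sigmoid(linear(FIRST, cues))
    result = {}
    for first, p_first in (("A", p_a), ("B", 1.0 - p_a)):
        # The second-phoneme boundary is selected by the first percept.
        p_c = sigmoid(linear(SECOND_GIVEN_FIRST[first], cues))
        result[(first, "C")] = p_first * p_c
        result[(first, "D")] = p_first * (1.0 - p_c)
    return result
```

Calling `joint_probabilities((0.5, 0.5))` returns a dictionary over the four phoneme pairs whose values sum to one. Because the second boundary differs by context, the same cue values yield a different conditional probability of C after A than after B.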