``Moving beyond the `beads-on-a-string' model of speech,''
Proc. IEEE ASRU Workshop, 1999, to appear.
The notion that a word is composed of a sequence of phone segments,
sometimes referred to as `beads on a string', has formed the basis
of most speech recognition work for over 15 years. However, as more
researchers tackle spontaneous speech recognition tasks, that view is
being called into question. This paper raises problems with the
phoneme as the basic subword unit in speech recognition, suggesting
that finer-grained control is needed to capture the sort of
pronunciation variability observed in spontaneous speech. We offer
two alternatives -- automatically derived subword units and
linguistically motivated distinctive feature systems -- and discuss
current work in these directions. In addition, we look at problems
that arise in acoustic modeling when trying to incorporate
higher-level structure with these two strategies.