Stochastic Perceptual Auditory-event-based Models (aka SPAM)
Many physiological and psychoacoustic studies suggest that human
auditory processing focuses on changes in the speech signal rather than
on regions with constant or slowly varying spectral properties. This
property may help make human speech recognition robust in the presence
of some kinds of acoustic interference. Stochastic Perceptual
Auditory-event-based Models (SPAMs) were developed by Nelson Morgan,
Herve Bourlard, Hynek Hermansky and Steve Greenberg to incorporate this
perspective into word models for speech recognition by machines. In a
nutshell, the approach ties together the statistics of the non-onset
portions of speech so that modeling power is focused on the onset
decisions.
Preliminary experiments by the Realization Group at the International
Computer Science Institute in Berkeley, CA suggest that this approach,
when used in combination with more conventional models, improves the
robustness of automatic speech recognition in the presence of slowly
varying additive noise.
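As a rough illustration of what tying the non-onset statistics can mean in practice, the sketch below relabels a frame-level phone transcription so that only the first frame of each segment keeps a distinct onset label, while every other frame falls into one shared class. This is a minimal sketch under stated assumptions, not the method used in the papers listed below: the function name, the `NON_ONSET` class name, the `onset:` label format and the `onset_width` parameter are all hypothetical.

```python
from typing import List

# Hypothetical name for the single class shared by all non-onset frames.
NON_ONSET = "non-onset"


def tie_non_onset_frames(frame_labels: List[str], onset_width: int = 1) -> List[str]:
    """Relabel frames so that only segment onsets keep a distinct label.

    frame_labels: one phone label per frame (an assumed input format).
    onset_width:  number of frames at the start of each segment that are
                  treated as the onset (an assumed, illustrative parameter).
    """
    targets = []
    run_length = 0
    previous = None
    for label in frame_labels:
        # Count how far we are into the current run of identical labels.
        run_length = run_length + 1 if label == previous else 1
        if run_length <= onset_width:
            # Onset frame: keep a label specific to the segment being entered.
            targets.append(f"onset:{label}")
        else:
            # Any later frame of the segment is tied to the shared class.
            targets.append(NON_ONSET)
        previous = label
    return targets


if __name__ == "__main__":
    frames = ["s", "s", "s", "ih", "ih", "k", "k", "k"]
    print(tie_non_onset_frames(frames))
```

With targets like these, a classifier spends one output class on all steady-state frames and the rest on distinguishing onsets, which is one reading of focusing modeling power on the onset decisions.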
Papers
-
Stochastic Perceptual Auditory-Event-Based Models
for Speech Recognition
Morgan, N., Bourlard, H., Greenberg, S., and Hermansky, H.,
Intl. Conference on Spoken Language Processing, 1994, pp. 1943-1946.
-
Stochastic Perceptual Models of Speech
Morgan, N., Bourlard, H., Greenberg, S., Hermansky, H. and Wu, S.,
IEEE Proceedings of the International Conference on
Acoustics, Speech and Signal Processing, Detroit, Michigan, 1995.
-
Properties of Stochastic Perceptual Auditory-event-based Models for
Automatic Speech Recognition
Su-Lin Wu,
Master's Project, UC Berkeley,
Spring 1995, ICSI Technical Report TR-95-023.
-
SPAM: Experiments with Digit Recognition
Morgan, N., Wu, S., Bourlard, H.,
Proceedings of the Speech Research Symposium,
Baltimore, Maryland, 1995.
-
Digit Recognition with Stochastic Perceptual Models
Morgan, N., Wu, S. and Bourlard, H.,
to appear in Proceedings of Eurospeech 1995, Madrid, Spain.
-
Transition-based Statistical Training for ASR
Morgan, N., Konig, Y., Wu, S. and Bourlard, H., Snowbird ASR Workshop,
Snowbird, Utah, 1995.
-
Stochastic Perceptual Speech Models With Durational Dependence
Bilmes, J., Morgan, N., Wu, S.-L., and Bourlard, H.
Intl. Conference on Spoken Language Processing, 1996.
Su-Lin Wu - November 29, 1995