Global Posterior Probability Estimates as Confidence Measures in an Automatic Speech Recognition System

ICASSP - 2001

Warner Warren

A natural approach to measuring decision confidence in an ASR system's linguistic outputs is to estimate the a posteriori probability that the outputs are correct given the acoustic inputs. In this work, the Recursive Estimation and Maximization of A posteriori Probability (REMAP) method is used to obtain posterior probability estimates. However, the REMAP estimates do not always adequately model the random process that generates the speech used to test the system. As a result, although the REMAP model produces reasonable frame-level (local) posterior probability estimates, its word-level (global) estimates sometimes tend to deviate from the source distribution over time. This work explores methods to compensate for this temporal deviation in order to obtain better global decision confidence estimates. The compensation strategies are based on normalization and on minimum divergence (maximum entropy) constrained optimization.
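
As a rough illustration of why a global estimate built from frame-level posteriors can drift with time, and of what a normalization step might look like, the following minimal sketch may help. It is not the REMAP procedure or the compensation method of this work. It assumes, purely for illustration, that a word-level score is formed by accumulating log frame posteriors over the frames aligned to the word, and that normalization means dividing by the number of frames (a geometric mean). The function names and the sample values are hypothetical.

```python
import math


def word_log_posterior(frame_posteriors):
    """Unnormalized word-level score: sum of log frame posteriors.

    Assumption for illustration only: frames are combined by a simple
    product, so the score shrinks as the word gets longer.
    """
    if not frame_posteriors:
        raise ValueError("at least one frame posterior is required")
    return sum(math.log(p) for p in frame_posteriors)


def normalized_word_confidence(frame_posteriors):
    """Duration-normalized score: geometric mean of the frame posteriors.

    Dividing the accumulated log score by the frame count removes the
    dependence on word duration (one hypothetical form of normalization).
    """
    n = len(frame_posteriors)
    return math.exp(word_log_posterior(frame_posteriors) / n)


# Two hypothetical words with identical per-frame quality but different lengths.
short_word = [0.9] * 5
long_word = [0.9] * 50

raw_short = word_log_posterior(short_word)
raw_long = word_log_posterior(long_word)
norm_short = normalized_word_confidence(short_word)
norm_long = normalized_word_confidence(long_word)
```

Under these assumptions the raw score of the longer word is much lower than that of the shorter one even though every frame is equally reliable, while the normalized scores coincide. The minimum divergence (maximum entropy) strategy mentioned above is a separate constrained optimization and is not sketched here.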