Comparing feature types


Since I am interested in using spectral rather than cepstral features, I ran a series of training experiments covering a range of different feature styles to see how much the error rate varied. The surprising result was that the variation was not significant. I don't know if I believe it!
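
As background, cepstra are (to a first approximation) just a DCT - an orthogonal linear transform - of the log spectra, which is one reason similar error rates are at least plausible. Here is a minimal NumPy/SciPy sketch of the relationship; the dimensions are invented for illustration and are not the actual front-end configuration used here:

    import numpy as np
    from scipy.fft import dct, idct

    # One invented frame of 45 log-spectral values (illustrative only,
    # not the real front end).
    rng = np.random.default_rng(0)
    log_spectrum = np.log(rng.random(45) + 1e-3)

    # Cepstra as an orthonormal DCT of the log spectrum; truncating to
    # the first few coefficients keeps only the smooth spectral shape.
    n_ceps = 13
    cepstra = dct(log_spectrum, norm='ortho')[:n_ceps]

    # The transform is invertible, so truncation is the only
    # information discarded in going from log spectra to cepstra.
    padded = np.zeros_like(log_spectrum)
    padded[:n_ceps] = cepstra
    smoothed_log_spectrum = idct(padded, norm='ortho')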

But here are the results anyway. Note that the baseline is rather high for the NUMBERS95 task: about 6.5% is what others get from embedded training. One problem might be the pronunciation dictionary I was using, which was some random one I picked up off the floor.

This work was originally mentioned in my status report of 1997oct03. The gory details are in my NOTES file from this work.


    Feature set            Net size     dev WER% (sub/del/ins)
    rasta-plp-cepstra      243/500/56   7.0% (4.0/1.3/1.7)
    rasta-plp-logspectra   405/500/56   6.6% (3.7/1.2/1.7)
    rasta-cepstra          243/500/56   6.9% (3.8/1.4/1.7)
    rasta-logspectra       405/500/56   6.8% (3.9/1.3/1.6)
    plp-cepstra            243/500/56   6.9% (3.7/1.8/1.4)
    plp-logspectra         405/500/56   7.1% (3.8/1.7/1.6)
    cepstra                243/500/56   6.8% (3.7/1.6/1.5)
    logspectra             405/500/56   7.0% (3.8/1.6/1.6)

Significance at the 5% level on this test set requires a difference of 0.88% (n=4680), so none of these numbers differ significantly from one another. In each case, I ran 4 iterations of embedded training (i.e. relabelling and retraining) and took the best result.
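
For the record, a crude two-proportion binomial approximation gives a threshold of roughly that size; this is my reconstruction, not necessarily the exact test behind the 0.88% figure:

    import math

    # Binomial approximation to the smallest significant WER difference
    # (a reconstruction; the exact test used above may differ).
    p = 0.068    # typical WER from the table above
    n = 4680     # test-set size quoted above
    se_diff = math.sqrt(2 * p * (1 - p) / n)   # s.e. of a WER difference

    print('one-sided 5%%: %.2f%%' % (100 * 1.645 * se_diff))  # ~0.86%
    print('two-sided 5%%: %.2f%%' % (100 * 1.960 * se_diff))  # ~1.02%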
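
For concreteness, each run's embedded training amounts to the loop below; the callables are placeholders standing in for the real training tools, not actual ICSI software:

    # Sketch of one embedded-training run: alternate retraining and
    # relabelling, keeping the best dev-set result as described above.
    def embedded_training(features, labels, train_net, align, score,
                          n_iters=4):
        best_wer, best_net = float('inf'), None
        for _ in range(n_iters):
            net = train_net(features, labels)   # retrain on current labels
            labels = align(net, features)       # relabel by forced alignment
            wer = score(net)                    # dev-set word error rate
            if wer < best_wer:                  # "took the best result"
                best_wer, best_net = wer, net
        return best_net, best_wer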

For your pleasure, here are images of the eight feature representations (four base features, each as cepstra and as log spectra):

[Image grid: rasta-plp, rasta, plp and plain base features, each rendered as cepstra and as log spectra]

Updated: $Date: 1999/05/21 00:28:18 $
DAn Ellis <dpwe@icsi.berkeley.edu>
International Computer Science Institute, Berkeley CA