ICSI BN Feature Matrix

Matrix of systems with 2000 hidden units (2000HU), each trained on 37 hours (37h) of data, showing all combinations of rasta, per-utterance normalization and deltas, both alone and in combination with RNN1:

		alone			RNN combo
		WERR%	WERR%+deltas	WERR%	WERR%+deltas
rasta		47.2	44.4		32.0	31.8
rasta-norm	45.7	44.4		32.1	32.5
plp		39.8	40.0		30.8	31.2
plp-norm	36.7	37.5		31.1	31.1

This table shows the overall word-error rate for the following conditions:

MLPs have 2000 hidden units and 54 output units
Nets have 13x9=117 input units for `plain' features, and 26x9=234 input units for features+deltas.
Training set is a randomized half of the 1996 set, i.e. 37h total (same for all nets).
Test set is a 32-minute subset of the hub4 97 eval.
Noway pruning set to beam 2.0 state_beam 4.0 nhyps 7 prob_min 0.00020
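
The network sizes listed above can be turned into a rough parameter count. The sketch below assumes a fully connected MLP with a single hidden layer and one bias per hidden and output unit, and reads 13x9 as 13 features per frame over a 9-frame input window; none of this is stated explicitly above, and the function name is made up for illustration. Under those assumptions only the input-to-hidden weights double when deltas are added, so the total grows by a factor of roughly 1.7.

```python
def mlp_param_count(n_in, n_hidden, n_out, biases=True):
    """Count weights (and optionally biases) in a one-hidden-layer MLP.

    Assumption: full connectivity between input, hidden and output layers.
    """
    weights = n_in * n_hidden + n_hidden * n_out
    bias_terms = (n_hidden + n_out) if biases else 0
    return weights + bias_terms


# Assumption: the "x9" factor is a 9-frame input window.
CONTEXT_FRAMES = 9

plain = mlp_param_count(13 * CONTEXT_FRAMES, 2000, 54)        # 117 inputs
with_deltas = mlp_param_count(26 * CONTEXT_FRAMES, 2000, 54)  # 234 inputs
ratio = with_deltas / plain

print(f"plain features:  {plain} parameters")
print(f"features+deltas: {with_deltas} parameters")
print(f"ratio:           {ratio:.2f}")
```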

The dimensions varied are:

Rasta filtering (rasta) vs. plain plp features (plp)
Per-utterance feature normalization (-norm) vs. no additional normalization
System used alone (alone) vs. system combined with Gary's best LNA system by averaging the log likelihoods (RNN combo)
Plain features used (WERR%) vs. features+deltas used (WERR%+deltas). Note that using deltas almost doubles the number of parameters in the net, but the training set size is kept the same.
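
The RNN combo condition above averages the log likelihoods of the two systems. A minimal sketch of that idea for a single frame follows. It assumes both systems score the same set of classes and are weighted equally; the function name and the toy numbers are made up for illustration and are not taken from the experiments, and the actual combination code may differ in detail.

```python
import math


def average_log_likelihoods(frame_a, frame_b):
    """Combine two systems' per-class scores for one frame.

    Each argument is a list of positive per-class likelihoods.
    Returns the per-class mean of the two log likelihoods.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("both systems must score the same set of classes")
    return [0.5 * (math.log(a) + math.log(b)) for a, b in zip(frame_a, frame_b)]


# Toy 3-class frame (hypothetical values, for illustration only).
mlp_frame = [0.7, 0.2, 0.1]
rnn_frame = [0.5, 0.4, 0.1]

combined = average_log_likelihoods(mlp_frame, rnn_frame)
best_class = max(range(len(combined)), key=combined.__getitem__)
print(combined, best_class)
```

Averaging in the log domain is equivalent to taking the geometric mean of the two systems' likelihoods, so a class that either system scores very low stays low in the combination.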

Dan Ellis <dpwe@icsi.berkeley.edu>
Modified: $Date: 1998/06/22 06:16:02 $