ICSI BN Feature Matrix
Matrix of 2000-hidden-unit (2000HU) systems trained on 37h of data, showing all combinations of rasta, per-utterance normalization and deltas, both alone and in combination with RNN1:
WERR%        alone    RNN combo
rasta        47.2     32.0
rastanorm    45.7     32.1
plp          39.8     30.8
plpnorm      36.7     31.1

This table shows the overall word error rate for the following conditions:

MLPs have 2000 hidden units and 54 output units

Nets have 13x9=117 input units for 'plain' features, and 26x9=234 input units for features+deltas.

Training set is a randomized half of the 1996 set, i.e. 37h total (same for all nets).

Test set is a 32-minute subset of the hub4 97 eval.

Noway pruning set to beam 2.0 state_beam 4.0 nhyps 7 prob_min 0.00020
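
The layer sizes listed above imply a rough parameter count for each net. The following is a minimal sketch, not the actual training code: it assumes a fully connected single-hidden-layer MLP with one bias per hidden and output unit, and it reads the factor 9 in 13x9 and 26x9 as a 9-frame context window. The function name mlp_param_count and the constant CONTEXT_FRAMES are hypothetical and do not appear in the original setup.

```python
# Rough parameter count for the nets described above.
# Assumptions (not stated in the source): fully connected layers, one bias
# per hidden and output unit, and 9 = number of context frames per input.

def mlp_param_count(n_inputs, n_hidden=2000, n_outputs=54):
    """Return the number of weights plus biases in a one-hidden-layer MLP."""
    input_to_hidden = n_inputs * n_hidden + n_hidden
    hidden_to_output = n_hidden * n_outputs + n_outputs
    return input_to_hidden + hidden_to_output

CONTEXT_FRAMES = 9                    # assumed meaning of the "x9" factor
plain_inputs = 13 * CONTEXT_FRAMES    # 117 input units, plain features
delta_inputs = 26 * CONTEXT_FRAMES    # 234 input units, features+deltas

plain_params = mlp_param_count(plain_inputs)
delta_params = mlp_param_count(delta_inputs)
ratio = delta_params / plain_params

print(plain_params, delta_params, round(ratio, 2))
```

Under these assumptions the plain-feature net has 344,054 parameters and the features+deltas net has 578,054, a ratio of about 1.68. The input-to-hidden weights double exactly, while the hidden-to-output layer is unchanged.
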
The dimensions varied are:

Rasta filtering (rasta) vs. plain plp features (plp)

Per-utterance feature normalization (norm) vs. no additional normalization

System used alone (alone) vs. system combined with Gary's best LNA system by averaging the log likelihoods (RNN combo)

Features alone used (WERR%) vs. features+deltas used (WERR%+deltas). Note that using deltas almost doubles the number of parameters in the net, but the training set size is kept the same.
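
Two of the dimensions above, per-utterance normalization and combination by averaging log likelihoods, can be sketched as follows. This is an illustration under stated assumptions, not the code used for these experiments: the source does not say whether normalization covers the mean only or both mean and variance (mean and variance are assumed here), nor whether the averaged log likelihoods are renormalized (they are here). The function names normalize_per_utterance and average_log_likelihoods and the floor parameter are hypothetical.

```python
import math

def normalize_per_utterance(frames):
    """Scale each feature dimension to zero mean and unit variance,
    using statistics from this one utterance only (assumed scheme)."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        stds.append(math.sqrt(var) or 1.0)  # constant dimension: avoid divide by zero
    return [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]

def average_log_likelihoods(probs_a, probs_b, floor=1e-10):
    """Combine two per-frame class probability vectors by averaging their
    logs, then renormalize so the result sums to one (assumed step)."""
    logs = [0.5 * (math.log(max(a, floor)) + math.log(max(b, floor)))
            for a, b in zip(probs_a, probs_b)]
    peak = max(logs)                        # subtract the peak for numerical stability
    unnorm = [math.exp(v - peak) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy data: a two-frame utterance with two feature dimensions,
# and one frame of two-class outputs from each of two systems.
normed = normalize_per_utterance([[1.0, 2.0], [3.0, 2.0]])
combined = average_log_likelihoods([0.9, 0.1], [0.1, 0.9])
```

Averaging in the log domain is equivalent to taking a geometric mean of the two systems' probabilities, so a class must score reasonably well under both systems to score well in the combination.
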
Dan Ellis <dpwe@icsi.berkeley.edu>
Modified: $Date: 1998/06/22 06:16:02 $