This page is a central location to gather and organize
the results of the various forays we are making into the Hub 4 Broadcast
News task here at ICSI. As our experiments continue, it will hopefully
form a kind of narrative of our approach to the problem.
1998aug03: Progress of the current trainings
1  2  3  4  5  6  7  8  9  10  11  12  13 
49.7  45.8  44.6  43.2  42.4  39.4  37.7  36.8  36.6  35.3  35.1  35.2  35.0 
1  2  3  4  5  6  7  8  9  10  11  12  13 
51.9  50.5  48.5  44.5  42.2  39.9  39.2  39.4  38.8  38.8  38.6  38.5  38.5 
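One way to read the two progress rows above is to ask how much each further epoch still buys. The sketch below assumes that the rows numbered 1 to 13 are training epochs and that the values under them are error rates in percent. The names `training_a` and `training_b` are placeholders, since the page does not label the two trainings.

```python
# Error rates copied from the two progress rows above.
# Assumption: one value per training epoch, in percent.
training_a = [49.7, 45.8, 44.6, 43.2, 42.4, 39.4, 37.7,
              36.8, 36.6, 35.3, 35.1, 35.2, 35.0]
training_b = [51.9, 50.5, 48.5, 44.5, 42.2, 39.9, 39.2,
              39.4, 38.8, 38.8, 38.6, 38.5, 38.5]


def per_epoch_change(errors):
    """Change in error from each epoch to the next (negative = improvement)."""
    return [round(after - before, 1) for before, after in zip(errors, errors[1:])]


def best_epoch(errors):
    """1-based index of the first epoch that reaches the lowest error."""
    return errors.index(min(errors)) + 1


for name, errors in (("training_a", training_a), ("training_b", training_b)):
    total = round(errors[0] - errors[-1], 1)
    print(name, "best epoch:", best_epoch(errors), "total reduction:", total)
    print("  per-epoch change:", per_epoch_change(errors))
```

Under these assumptions, the per-epoch changes make it easy to see where each training flattens out.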
  No Deltas / No Deltas Normalized  Delta Window 9 / Delta Window 9 Normalized  Delta Window 5 / Delta Window 5 Normalized
Context Window 9  39.8 / 36.7  40.0  37.7 / 35.0
Context Window 5  44.5  40.5  39.3 / 35.7
  No Deltas / No Deltas Normalized  Delta Window 9 / Delta Window 9 Normalized  Delta Window 5 / Delta Window 5 Normalized
Context Window 9  30.8 / 31.1  31.2  30.5
Context Window 5  32.2  30.8  30.9 / 30.2
  No Deltas / No Deltas Normalized  Delta Window 9 / Delta Window 9 Normalized  Delta Window 5 / Delta Window 5 Normalized
Context Window 9  33.9  
Context Window 5  34.7  34.0 / 34.7 
  No Deltas / No Deltas Normalized  Delta Window 9 / Delta Window 9 Normalized  Delta Window 5 / Delta Window 5 Normalized
Context Window 9  29.6  
Context Window 5  29.8  29.4 / 29.9 
System/Spoke  all  F0  F1  F2  F3  F4  F5  Fx
% test set  100.0  37.8  19.6  14.9  5.1  12.6  2.2  7.7

Features  # Hidden Units  # epochs  %err alone  %err w/RNN 
PLP12N, 7hyp  2000  8  36.7  31.1 
"  2000  13  35.0  30.5 
"  4000  8  36.0  31.0 
"  4000  10  35.0  30.4 
"  4000  13  34.4  30.1 
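To make the effect of the RNN combination easier to compare across rows, the sketch below computes the absolute and relative error reduction for each row of the table above. It assumes that "%err w/RNN" is the error of the same net when combined with the RNN. The function name `rnn_gain` is made up for this illustration.

```python
# Rows copied from the table above:
# (hidden units, epochs, %err alone, %err w/RNN)
results = [
    (2000, 8, 36.7, 31.1),
    (2000, 13, 35.0, 30.5),
    (4000, 8, 36.0, 31.0),
    (4000, 10, 35.0, 30.4),
    (4000, 13, 34.4, 30.1),
]


def rnn_gain(err_alone, err_with_rnn):
    """Return (absolute, relative) error reduction, both in percent."""
    absolute = err_alone - err_with_rnn
    relative = 100.0 * absolute / err_alone
    return round(absolute, 1), round(relative, 1)


for hidden, epochs, alone, with_rnn in results:
    absolute, relative = rnn_gain(alone, with_rnn)
    print(f"{hidden} HU, {epochs} epochs: "
          f"{absolute} absolute, {relative}% relative")
```

On these five rows the absolute gain from the RNN narrows from 5.6 to 4.3 points as the error of the net alone falls.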
msg0+2N8khalf epoch 8  Hidden Bias (mean 6.5684)  Output Bias (mean 3.758) 
plp12N8khalf merged with msg0+2N8khalf  Hidden Bias (mean 8.1153)  Output Bias (mean 3.5811) 
plp12+d5_cw5  Hidden Bias (mean 6.7564)  Output Bias (mean 3.6158) 
plp12N  Hidden Bias (mean 6.6209)  Output Bias (mean 3.6358) 
The subbands are labelled "a", "b", "c", and "d", and consist of the
following features:
Band  Top Half Features  Bottom Half Features 
A  0,1,2,3,4  14,15,16,17,18 
B  5,6,7,8  19,20,21,22 
C  9,10,11  23,24,25 
D  11,12,13  25,26,27 
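The band table above can be written down as a small lookup, which is handy for pulling one band (or a combination of bands) out of a feature frame. This is only a sketch under stated assumptions: the frame is taken to have 28 features (the highest index in the table is 27), indices shared between bands C and D are kept only once when bands are combined, and "drop one" is read as every set of three bands out of four. The names `SUBBANDS`, `band_indices` and `select_bands` are made up for this illustration.

```python
from itertools import combinations

# Feature indices per subband, copied from the table above:
# band -> (top half features, bottom half features)
SUBBANDS = {
    "a": ([0, 1, 2, 3, 4], [14, 15, 16, 17, 18]),
    "b": ([5, 6, 7, 8], [19, 20, 21, 22]),
    "c": ([9, 10, 11], [23, 24, 25]),
    "d": ([11, 12, 13], [25, 26, 27]),
}


def band_indices(bands):
    """Sorted feature indices for a string of band letters, e.g. "abd".

    Assumption: indices that appear in two bands (11 and 25) are kept once.
    """
    indices = set()
    for band in bands:
        top, bottom = SUBBANDS[band]
        indices.update(top)
        indices.update(bottom)
    return sorted(indices)


def select_bands(frame, bands):
    """Pick the features of the given bands out of one feature frame."""
    return [frame[i] for i in band_indices(bands)]


# Assumed reading of "drop one": all three-band subsets of the four bands.
drop_one = ["".join(combo) for combo in combinations("abcd", 3)]

frame = list(range(28))  # dummy frame whose values equal their indices
print(select_bands(frame, "d"))
print(drop_one)
```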
Bands  WER 
abcd  38.7 
abcdrnn*  30.4 
abcd  64.9 
abcdabcdx4  40.5 
abcdabcdx4rnnx8  31.8 
a+b+c+d  48.6 
a+b+c+dabcdx4  38.5 
a+b+c+dabcdx4rnnx8  30.7 
abcabcd  38.7 
abcabdacdbcd  37.9 
abcabdacdbcdabcdx4  37.7 
abcabdacdbcdabcdx4rnnx8  30.2 
cep abcd  Out of memory on teq 
klt abcd  62.7 
cep abcdabcdx4  40.2 
klt abcdabcdx4  40.2 
cep abcdabcdx4rnnx8  31.6 
klt abcdabcdx4rnnx8  31.4 
klt a+b+c+d  47.6 
klt a+b+c+d no norm  54.7 
klt a+b+c+dabcd  38.7 
klt a+b+c+dabcdx4  38.2 
klt a+b+c+dabcdx4 no norm  37.7 
klt a+b+c+dabcdx4rnnx8  30.3 
klt a+b+c+dabcdx4rnnx8 no norm  30.5 
From these results, the single multiband system hurts overall recognition, even when the bands are combined using an MLP. The "drop one" multiband system doesn't help enough to be worth the extra complexity (although I haven't yet tried combining the "drop one" bands using an MLP).
See the bnspokes database for a breakdown of the focus conditions.