Multi-Layer Perceptrons for Speech Recognition (Intel/ParLab, 2008)
Historical note: It's a "neural network", which was considered taboo.
Beyond WER: How to Evaluate Speech Technologies (SpeechTek, 2018)
tl;dr: There are a lot of problems with the industry-standard benchmark.
M-Y. Hwang, G. Peng, M. Ostendorf, W. Wang, A. Faria, and A. Heidel
Building a Highly Accurate Mandarin Speech Recognizer with Language-Independent Technologies and Language-Dependent Modules
IEEE Transactions on Audio, Speech, and Language Processing.
2009.
D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena,
D. Rybach, C. Gollan, R. Schlüter, K. Kirchhoff, A. Faria, and N. Morgan
Development of the SRI/Nightingale Arabic ASR System
Proc. of Interspeech.
2008.
J. Chong, Y. Yi, A. Faria, S. Rajagopalan, K. Keutzer
Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors
Workshop on Emerging Applications and Many-core Architecture (EAMA).
2008.
A. Faria and N. Morgan.
Corrected Tandem Features for Acoustic Model Training.
Intl. Conf. Acoustics, Speech, Signal Processing (ICASSP).
2008.
A. Faria and N. Morgan.
When a Mismatch Can Be Good: Large vocabulary speech recognition trained with idealized Tandem features.
Proc. ACM Symposium on Applied Computing (SAC).
2008.
M-Y. Hwang, G. Peng, W. Wang, A. Faria, A. Heidel.
Building a Highly Accurate Mandarin Speech Recognizer
Proc. Automatic Speech Recognition and Understanding (ASRU).
2007.
S. Petrov, A. Faria, P. Michaillat, A. Berg, A. Stolcke, D. Klein, J. Malik.
Detecting Categories in News Video Using Acoustic, Speech, and Image Features
Proc. TREC Video Retrieval Workshop (TRECVID).
2006.
A. Faria
Accent Classification for Speech Recognition
Proc. Machine Learning for Multimodal Interaction (MLMI), LNCS 3869.
2005.
A. Faria and D. Gelbart
Efficient Pitch-based Estimation of VTLN Warp Factors
Proc. Eurospeech.
2005.