EECS 225d (3 units)
Audio Signal Processing in Humans and Machines
247 Cory, MWF 2-3
Spring Semester, 1995
Professors Morgan and Gold

The focus of the course is on engineering models for speech and music processing. These models are used to design systems for analysis, synthesis, and recognition. For each of these topics there will be an emphasis on physiological and psychoacoustic properties of the human auditory and speech generation systems, particularly from an engineer's perspective: how can we make use of knowledge about these natural systems when we design artificial ones?

Topics include: an introduction to pattern recognition; speech coding, synthesis, and recognition; models of speech and music production and perception; signal processing for speech analysis; pitch perception and auditory spectral analysis with applications to speech and music; a historical survey of speech synthesizers from the 18th century to the present; robustness to environmental variation in speech recognizers; vocoders and music synthesizers; statistical speech recognition, including an introduction to Hidden Markov Model and Neural Network approaches.

Prerequisites: EE123 or equivalent, and Stat 200A or equivalent; or grad standing and consent of instructor.

Required text: J. Flanagan, "Speech Analysis, Synthesis and Perception," Springer-Verlag, New York, 1972 (the book is out of print; we have copied it), and the course reader. Both are available at Copy Central on Bancroft.

Recommended texts:

EECS 225d Tentative Schedule, Spring 1995

INTRODUCTORY MATERIAL + PRODUCTION
WEEK 1:
1. Overall Introduction (Morgan) Jan 18
2. Early History of Speech and Music Synthesis (Gold) Jan 20
WEEK 2:
3. Speech Analysis/Synthesis Overview (Gold) Jan 23
4. Speech Production Models I (Gold) Jan 25
5. Early History of Speech Recognition (Morgan) Jan 27
WEEK 3:
6. Speech Recognition Overview (Morgan) Jan 30
7. Speech Production Models II (Morgan) Feb 1
8. Phonemes and features (Ohala) Feb 3
WEEK 4:
9. Music Production Models I (Gold) Feb 6
10. Music Production Models II (Gold) Feb 8
11. Room Acoustics (Morgan) Feb 10

HUMAN PERCEPTION
WEEK 5:
12. Physiology of the Ear (Gold) Feb 13
13. Engineering models of the cochlea (guest: Lazzaro) Feb 15
14. Psychophysics and Modelling of Speech Perception I (Greenberg) Feb 17
(1st day of 6th week is a holiday - President's Day)
WEEK 6:
15. Engineering models of human formant tracking (Holton) Feb 22
16. Psychophysics and Modelling of Speech Perception II (Greenberg) Feb 24
WEEK 7:
17. Engineering models of psychophysics (Hermansky) Feb 27
18. Models of Pitch Perception (Gold) Mar 1
19. Musical Pitch (Gold) Mar 3
WEEK 8:
20. Midterm 1 Mar 6

MACHINE ANALYSIS AND SYNTHESIS
21. Pitch Detection of Speech and Music (Gold) Mar 8
22. How do humans process and recognize speech? (Jont Allen) Mar 10
WEEK 9:
23. Vocoders I (Pitch) (Gold) Mar 13
24. Vocoders II (Spectral Analysis) (Gold) Mar 15
25. Vocoders III (Gold) Mar 17
WEEK 10:
26. Speech Synthesis (Gold) Mar 20
27. Music Synthesis (Gold) Mar 22
28. Speech/music demos (Sid Fels) Mar 24
(Week 11 is spring break)
WEEK 12:
29. Intro to pattern recognition - Feature Extraction (Morgan) Apr 3
30. Intro to pattern recognition - pattern classification (Morgan) Apr 5
31. Intro to statistical pattern recognition (Morgan) Apr 7
WEEK 13:
32. Template matching (Morgan) Apr 10
33. Hidden Markov Models (Morgan) Apr 12
34. Neural Networks for Speech Recognition (Morgan) Apr 14

REAL SYSTEMS
WEEK 14:
35. Natural language in a speech interface - the Berkeley Restaurant Project (Jurafsky) Apr 17
36. Text-to-speech systems (O'Malley) Apr 19
37. Speech recognition for the Census (Ron Cole) Apr 21
WEEK 15:
38. Midterm 2 Apr 24

EPILOGS
39. Using Perceptual models for speech recognition (Ghitza) Apr 26
40. Using Perceptual models for Analysis/Synthesis (Ghitza) Apr 28

THE FINISH
WEEK 16:
41-43. Student presentations May 1, 3, 5
WEEK 17:
44. Some Current Research Topics (Morgan) May 8


fosler@ICSI.Berkeley.EDU
Wed Apr 19 11:24:41 PDT 1995