EECS 225d (3 units)
Audio Signal Processing in Humans and Machines
6th floor at ICSI, MW 4:00-5:30
Spring Semester, 2014
Professor Morgan

HOMEWORKS

Homework #1:

By Monday, Feb 3, have read Chapters 2-4 and send answers to Exercises 2.3 and 3.1 from the book to morgan@icsi.berkeley.edu with the subject line HOMEWORK #1.

Exercises repeated here:

2.3 Compare von Kempelen's speaking machine with Dudley's Voder. (a) What are the chief differences? (b) What are the chief similarities? (c) How would you build a von Kempelen machine today?

3.1 Explain why wideband spectrograms show periodicity in time whereas narrowband spectrograms show periodicity in frequency.
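For intuition, here is a minimal sketch (assuming Python with numpy and scipy; the 100 Hz pulse train stands in for the periodic excitation of voiced speech) that runs both analyses on the same signal. The short window is briefer than one pitch period, so the pulses show up as striations in time; the long window spans several periods, so the 100 Hz harmonics show up as striations in frequency.

    import numpy as np
    from scipy.signal import spectrogram

    # Synthetic "voiced" signal: a 100 Hz impulse train (10 ms pitch period).
    fs = 8000                       # sample rate, Hz
    x = np.zeros(fs)                # 1 second of signal
    x[::fs // 100] = 1.0            # one pulse every 10 ms

    # Wideband analysis: short window (~3 ms, less than one pitch period).
    # Frequency bins are fs/25 = 320 Hz apart, too coarse to separate the
    # harmonics, but individual pulses appear as vertical striations in time.
    f_w, t_w, S_wide = spectrogram(x, fs, nperseg=25, noverlap=20)

    # Narrowband analysis: long window (~31 ms, several pitch periods).
    # Frequency bins are fs/250 = 32 Hz apart, fine enough to resolve the
    # 100 Hz harmonics, which appear as horizontal striations in frequency.
    f_n, t_n, S_narrow = spectrogram(x, fs, nperseg=250, noverlap=200)

    print("wideband bin spacing:   %.0f Hz" % (f_w[1] - f_w[0]))
    print("narrowband bin spacing: %.0f Hz" % (f_n[1] - f_n[0]))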

Homework #2:

By Monday, Feb 10, have read Chapters 5 and 18 and send answers to Exercises 4.3, 5.3, and 18.2 from the book to morgan@icsi.berkeley.edu with the subject line HOMEWORK #2.

Exercises repeated here:

4.3 Find a newspaper, magazine, or Web announcement about some speech recognition system, either commercial or academic. Can you conclude anything about the structure and capabilities of the system? If there is any content in the release information, try to associate your best guesses about the systems with any of the historical developments described in this chapter.

5.3 Describe some situations in which a five-word recognizer can accomplish a more difficult task than a 1000-word recognizer.

18.2 Lippmann points out many ways in which 1996 speech recognition technology is inferior to the capabilities of human speech recognition. Suggest some situations in which human speech recognition could potentially be worse than an artificial implementation.

Homework #3:

Please send me your three ranked choices for which presentations you would prefer to give (where number 1 is your first choice, etc.) by Saturday, February 15.

1. February 24: Acoustical Basics (Chapters 10-11)

2. March 3: Neural Networks for Speech (some material in chapter 27)

3. March 12: Feature Extraction for ASR 2 (Chapter 22)

4. March 17: Deterministic Sequence Recognition for ASR (Chapter 24)

5. March 19: Statistical Sequence Recognition (and Training) for ASR (Chapters 25 and 26)

6. March 31: Adaptation and Discriminant Training for ASR (Chapter 27)

7. April 7: Speaker Verification (Chapter 41)

8. April 14: Source Separation (Chapter 39)

9. April 16: Music Signal Analysis (Chapter 37)

Homework #4:

By Friday, Feb 28, send answers to Exercises 8.2, 8.3, 9.2, and 9.3 from the book to morgan@icsi.berkeley.edu with the subject line HOMEWORK #4.

Exercises repeated here:

8.2 George has designed a classifier that can perfectly distinguish between basketball players and researchers for every example in the training set. Martha has designed a classifier that makes some errors in the training set classification. However, Martha insists that hers is better. How can she be right? In other words, what are some potential pitfalls in the development of a classifier given a training set?
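As a toy illustration of how Martha could be right (a sketch in Python/numpy; the data and both classifiers are invented for this example): George's memorizing rule is perfect on the training set but is misled by the noise dimensions on new data, while Martha's cruder rule makes training errors yet generalizes better.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(n, rng):
        y = rng.integers(0, 2, n)
        # One weakly informative feature plus 20 pure-noise features.
        signal = (y - 0.5) + rng.normal(0.0, 1.0, n)
        noise = rng.normal(0.0, 1.0, (n, 20))
        return np.column_stack([signal, noise]), y

    Xtr, ytr = make_data(100, rng)
    Xte, yte = make_data(1000, rng)

    # "George": 1-nearest neighbor, which memorizes the training set and
    # therefore has zero training error by construction.
    def george(X):
        d = ((X[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
        return ytr[d.argmin(axis=1)]

    # "Martha": threshold the single informative feature. She makes training
    # errors because the classes overlap, but ignores the noise dimensions.
    def martha(X):
        return (X[:, 0] > 0).astype(int)

    for name, f in [("George (1-NN)", george), ("Martha (threshold)", martha)]:
        print(name, "train err %.2f  test err %.2f" %
              (np.mean(f(Xtr) != ytr), np.mean(f(Xte) != yte)))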

8.3 Assuming that one is willing to use an infinite number of training examples (and take an infinite amount of time for classification), would it be possible to do perfect classification of speech utterances given all possible waveforms?

9.2 The heights and weights of basketball players and speech researchers are vector quantized so that there are 16 different possible reference indices i corresponding to (height, weight) pairs. A training set is provided that has the height and weight and occupation of 1000 people, 500 from each occupation. How would you assign them to reference indices? Given a chosen approach for this, how would you estimate the discrete densities for the likelihoods P(i | occupation)? How would you estimate the posteriors P(occupation | i)? How would you estimate the priors P(occupation) and P(i)?
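One possible approach, sketched below in Python (numpy/scipy). The synthetic data, the use of k-means to build the codebook, and the smoothing constant are all assumptions made for illustration, and the seed keyword to kmeans2 requires SciPy 1.7 or later.

    import numpy as np
    from scipy.cluster.vq import kmeans2

    rng = np.random.default_rng(0)

    # Invented (height cm, weight kg) data: 500 players, 500 researchers.
    players     = rng.normal([200.0, 100.0], [8.0, 10.0], size=(500, 2))
    researchers = rng.normal([172.0,  75.0], [9.0, 12.0], size=(500, 2))
    X = np.vstack([players, researchers])
    occ = np.array([0] * 500 + [1] * 500)     # 0 = player, 1 = researcher

    # One reasonable assignment: build a 16-entry codebook with k-means on
    # the pooled data; each person gets the index of the nearest codeword.
    codebook, idx = kmeans2(X, 16, minit='++', seed=0)

    # Likelihoods P(i | occupation): count index occurrences per class and
    # normalize; the small added count guards against empty cells.
    counts = np.vstack([np.bincount(idx[occ == c], minlength=16) + 0.5
                        for c in (0, 1)])
    P_i_given_occ = counts / counts.sum(axis=1, keepdims=True)

    # Priors from the training proportions, and P(i) by marginalization.
    P_occ = np.array([0.5, 0.5])
    P_i = P_occ @ P_i_given_occ

    # Posteriors by Bayes' rule: P(occupation | i) = P(i|occ) P(occ) / P(i).
    P_occ_given_i = P_i_given_occ * P_occ[:, None] / P_i
    print(np.round(P_occ_given_i, 2))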

9.3 A Bayes decision rule is the optimum (minimum error) strategy. Suppose that Albert uses a Bayes classifier and estimates the probabilities by using a Gaussian parametric form for the estimator. Beth is working on the same problem and claims to get a better result without even using explicitly statistical methods. Can you think of any reason that Beth could be right?
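One concrete scenario: if the true class-conditional densities are far from Gaussian, Albert's parametric model is badly biased, and even a simple non-statistical rule such as nearest-neighbor lookup can beat it. A toy numpy sketch under that assumption, with interleaved bimodal classes that a single Gaussian fits poorly:

    import numpy as np

    rng = np.random.default_rng(1)

    def sample(n, cls, rng):
        # Class 0 lives on [0,1] U [2,3]; class 1 on [1,2] U [3,4].
        base = rng.uniform(0, 1, n) + 2 * rng.integers(0, 2, n)
        return base + cls

    Xtr0, Xtr1 = sample(500, 0, rng), sample(500, 1, rng)
    Xte0, Xte1 = sample(500, 0, rng), sample(500, 1, rng)

    # "Albert": fit one Gaussian per class; with equal priors, compare the
    # class log likelihoods directly. The fitted means (about 1.5 and 2.5)
    # put his decision boundary near 2.0, which misclassifies roughly half
    # of each class here.
    def gauss_loglik(x, data):
        mu, var = data.mean(), data.var()
        return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

    def albert(x):
        return (gauss_loglik(x, Xtr1) > gauss_loglik(x, Xtr0)).astype(int)

    # "Beth": plain 1-nearest-neighbor lookup, no explicit statistics.
    Xtr = np.concatenate([Xtr0, Xtr1])
    ytr = np.array([0] * 500 + [1] * 500)

    def beth(x):
        return ytr[np.abs(x[:, None] - Xtr[None, :]).argmin(axis=1)]

    Xte = np.concatenate([Xte0, Xte1])
    yte = np.array([0] * 500 + [1] * 500)
    print("Albert (Gaussian) error:", np.mean(albert(Xte) != yte))
    print("Beth (1-NN) error:      ", np.mean(beth(Xte) != yte))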

Maintained by:
N. Morgan
morgan@ICSI.Berkeley.EDU
$Date: 2014/1/19 $