Analyzing sound mixtures
·
Real sound = many sources (speech, ...)
How to analyze / decompose?
·
The CASA approach
- bottom-up cues
- assemble into larger structures
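
As a rough illustration of assembling bottom-up cues into larger structures, the sketch below groups frequency channels whose energy onsets are nearly simultaneous. Common onset is only one cue that could be used; the function name `group_by_onset`, the `(channel, onset_time)` input format, and the 20 ms tolerance are all hypothetical choices made for this example, not taken from the slide.

```python
def group_by_onset(elements, tolerance=0.02):
    """Group (channel, onset_time) pairs by common onset.

    An element joins the current group if its onset lies within
    `tolerance` seconds of that group's first onset (assumed rule).
    """
    groups = []
    for channel, onset in sorted(elements, key=lambda e: e[1]):
        if groups and onset - groups[-1]["onset"] <= tolerance:
            groups[-1]["channels"].append(channel)
        else:
            # Start a new group anchored at this onset time.
            groups.append({"onset": onset, "channels": [channel]})
    return groups


# Hypothetical onset detections: channels 0-2 start together, 3-4 later.
elements = [(0, 0.100), (1, 0.105), (2, 0.110), (3, 0.500), (4, 0.510)]
groups = group_by_onset(elements)
```

Each resulting group is a candidate "larger structure" (one source event) built only from local evidence, with no model of what the source is.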
·
ASR approaches
- train on mixed signals (captures only average, static noise)
- parallel models (decomposition)
but:
- calculating joint probabilities
- relative levels (cepstral domain)
- suitability of state model for nonspeech?
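
A minimal sketch of the parallel-models idea for a single frame, to show why the joint probability calculation is costly: every pair of (speech state, noise state) must be scored. It assumes log-spectral features, unit-variance diagonal Gaussians, and the log-max approximation for combining two sources (each bin is taken from whichever source is louder); none of these choices comes from the slide, and all names and numbers are hypothetical. A full decomposition would search over state-pair sequences rather than one frame.

```python
import itertools
import math


def log_gauss(x, mean, var=1.0):
    """Log-likelihood of x under a diagonal Gaussian with shared variance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mi) ** 2 / var)
        for xi, mi in zip(x, mean)
    )


def best_state_pair(obs, speech_means, noise_means):
    """Score every (speech state, noise state) pair for one frame."""
    best = None
    pairs = itertools.product(enumerate(speech_means), enumerate(noise_means))
    for (i, s), (j, n) in pairs:
        # Log-max approximation (assumed): louder source dominates each bin.
        combined = [max(a, b) for a, b in zip(s, n)]
        score = log_gauss(obs, combined)
        if best is None or score > best[0]:
            best = (score, i, j)
    return best


# Hypothetical 3-bin log-spectral state means and one observed frame.
speech_means = [[10.0, 2.0, 1.0], [1.0, 9.0, 2.0]]
noise_means = [[3.0, 3.0, 3.0], [0.0, 0.0, 8.0]]
obs = [1.2, 9.1, 7.8]
score, speech_state, noise_state = best_state_pair(obs, speech_means, noise_means)
```

With N speech states and M noise states, each frame needs N x M evaluations. The sketch also works in the log-spectral domain, where the max rule is simple; it does not address how relative levels behave in the cepstral domain.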
·
What do people do?
- ?: hypotheses pruned by general-purpose bottom-up cues
- i.e. a combination...
·
Exploiting ASR in CASA
- Dan Ellis
- 1997may21