Klatt Audio Scribe Notes for EE225d
Note: these files are large (150-200 KB apiece); the whole thing
is 6 MB. They are compressed with gzip.
Part A: Development of speech synthesizers
The VODER of Homer Dudley, 1939.
The Pattern Playback designed by Franklin Cooper, 1951.
PAT, the "Parametric Artificial Talker" of Walter Lawrence, 1953.
The "OVE" cascade formant synthesizer of Gunnar Fant.
Copying a natural sentence using Walter Lawrence's PAT formant synthesizer, 1962.
Copying the same sentence using the second generation of Gunnar Fant's OVE cascade formant synthesizer, 1962.
Comparison of synthesis and a natural sentence, using OVE II, by John Holmes, 1961.
Comparison of synthesis and a natural sentence, John Holmes using his parallel formant synthesizer, 1973.
Attempting to scale the DECtalk male voice to make it sound female.
Comparison of synthesis and a natural sentence, female voice, Dennis Klatt, 1986b.
The DAVO articulatory synthesizer developed by George Rosen at M.I.T., 1958.
Sentences produced by an articulatory model, James Flanagan and Kenzo Ishizaka, 1976.
Linear-prediction analysis and resynthesis of speech at a low bit rate in the Texas Instruments Speak & Spell toy, Richard Wiggins, 1980.
Comparison of synthesis and a natural recording, automatic analysis-resynthesis using multipulse linear prediction, Bishnu Atal, 1982.
Part B: Segmental synthesis by rule
Creation of a sentence from rules in the head of Pierre Delattre, using the Haskins Pattern Playback, 1959.
Output from the first computer-based phonemic-synthesis-by-rule program, created by John Kelly and Louis Gerstman, 1961.
Elegant rule program for British English by John Holmes, Ignatius Mattingly, and John Shearme, 1964.
Formant synthesis using diphone concatenation, by Rex Dixon and David Maxey, 1968.
Rules to control a low-dimensionality articulatory model, by Cecil Coker, 1968.
Part C: Synthesis by rule of segments and sentence prosody
First prosodic synthesis by rule, by Ignatius Mattingly, 1968.
Sentence-level phonology incorporated in rules by Dennis Klatt, 1976.
Concatenation of linear-prediction diphones, by Joe Olive, 1977.
Concatenation of linear-prediction demisyllables, by Catherine Browman, 1980.
Part D: Fully automatic text-to-speech conversion
The first full text-to-speech system, done in Japan by Noriko Umeda et al., 1968.
The first Bell Laboratories text-to-speech system by Cecil Coker, Noriko Umeda, and Catherine Browman, 1973.
The Haskins Laboratories text-to-speech system, 1973.
The Kurzweil reading machine for the blind, Raymond Kurzweil, 1976.
The inexpensive Votrax Type-n-Talk system, by Richard Gagnon, 1978.
The Echo low-cost diphone concatenation system, about 1982.
The M.I.T. MITalk system by Jonathan Allen, Sheri Hunnicutt, and Dennis Klatt, 1979.
The multi-language Infovox system, by Rolf Carlson, Bjorn Granstrom, and Sheri Hunnicutt, 1982.
The Speech Plus Inc. "Prose-2000" commercial system, 1982.
The Klattalk system by Dennis Klatt of M.I.T., which formed the basis for Digital Equipment Corporation's DECtalk commercial system, 1983.
The AT&T Bell Laboratories text-to-speech system, 1985.
Several of the DECtalk voices.
DECtalk speaking at about 300 words/minute.