DAn Ellis's ICSI homepage

Until August 2000, I was a senior research scientist working on speech and audio processing and recognition within the Realization group at the International Computer Science Institute in Berkeley, California. This is basically a frozen version of the web site from that phase of my career, though there are certain parts that I am keeping up-to-date (such as my talks and publications lists).

I am now an assistant professor at Columbia University in the City of New York. For more details, see my Columbia website.

You can also see my even older web site, with some personal and research-related information, at the M.I.T. Media Lab. It hasn't been updated since 1996apr.

My Research tree

.. wherein may be found several pages relating to my research. Again, I apologize for the absence of an overview or roadmap.

My bookmarks file.

Recent stuff

I finally created a web page to describe the SPRACHcore connectionist speech recognition software release.
Put up a page of my notes on some papers at ICASSP-2000, complete with links to the ICSI-local online copy of the proceedings CD. Also put up the PDF of my talk on Tandem Acoustic Modeling, as given at Sheffield last week, and re-used for the Realization lunch this week.
My ICASSP 2000 poster on Tandem connectionist modeling is now up.
I have started to collect the results from my experiments with this year's Aurora noisy digits task in the Aurora 2000 page.
I gave a talk at the AVIOS conference in San Jose on better recognizing through combination - a pretty high-level survey of combination strategies in recognition.
Also added a page on the recording control software for the Meeting Recorder project.
Created a page describing the Videoconferencing Room Audio Setup being used for data collection in the Meeting Recorder project. I added this within my directory of Meeting Recorder Project information.
Added slides for my latest talk, Sound content-based analysis, which was an update to the members of the group on what I'm thinking about.
Created a page describing the use of cross correlation for speaker tracking with the new meeting recorder data we are collecting.
Added the slides from my quick overview talk on speech interfaces that I gave at a recent mini-workshop on campus.
Added the slide pack in which I described my most recent European project meetings to the Realization group lunch.
Finally finished a page sketching the tortuous path by which the original training targets for the AURORA noisy digits task were created.
Added my slide pack from the ICSI lunch talk reviewing my European tour.
I've started assembling some information on my new project, General Audio Mixture Analysis and Retrieval, which is concerned with using information retrieval, computational auditory scene analysis and machine learning to provide content-based access to real audio, particularly that which does not contain speech. Currently, there's just the PDF of the project summary and description that I submitted to the NSF as part of a grant application.
My European trip continued with a poster on speech/music discrimination at Eurospeech-99, then talks at meetings of the RESPITE and THISL projects, both held in Switzerland.
I have been in Finland visiting the Tampere University of Technology. I gave two talks, one on CASA and one on speech recognition, both available in Acrobat PDF from my talks page.
New details continue to appear on the AURORA page, including a comparison of the statistics of MSG and MFCC features, which would appear to account for the poor performance of the HTK/MSG system.
I've started a page describing my experiments with the AURORA noisy digits task as part of the RESPITE project.
I reproduced some results presented to us by Hynek Hermansky showing that the mean spectral features around a given label have significant structure out to several hundred milliseconds, i.e. far beyond the duration of that single phone.
The ThislIR GUI is a speech-input spoken-document-retrieval system that we are developing in conjunction with several European partners. I've been working on a Tcl/Tk front-end, as described in this page.
I did an investigation of the temporal structure of modulation-filtered spectrogram features just to see what these excellent features of Brian's really look like.
I'm trying to finish a detailed description of the Spam+AntiSpam (a/k/a spamnotspam) project I worked on a little before the ICSLP submission deadline. You can see the beginning of it at The spamnotspam report.
I've copied a bunch of proceedings CD-ROMs onto my scratch disk (for now); you can find them at the Realization Group Online Proceedings Page.
I have some pages relating to the ICSI Broadcast News (HUB4) effort: Results summary table, Feature comparison matrix, and training and decode machine usage.
Put up the slides from my Realization Group Meeting talk last week on Visualization tools. Also added a 2nd image to my (private) RESPITE site.
Added the slides from my talk on automatic audio content analysis at the recent MPEG-7 symposium in San Jose.
Put up the companion page for my submission to the Speech Communication special issue on CASA, which includes client-side image maps to play the different sound examples.
Added improved figures and some new placeholder text to my introduction to the ICSI speech recognition software.
Put together a bunch of hints and tips relating to the Mohonk workshop from myself and other committee members. It includes a bunch of general notes on publishing electronic documents via PostScript and PDF.
Made a page for my older but possibly more useful pfile viewer pfview. Also put up the page for my ICASSP'98 submission.
Started a page of noise-excited speech examples to illustrate the use of my surfsynth tool, as well as the insight it can provide about what's really happening in our feature sets.
Added links on the talks page to the slides of my Mohonk WASPAA talk and Haskins/NUWC talk from my recent trip out east.
The beginnings of my introduction to the ICSI speech recognition software. This is so vestigial that I feel ashamed even putting it on the site, but maybe the shame will encourage me to work on it.
A page describing the Speech recognition visualization front-end I recently constructed as a demo of our work, but which ended up looking like it might make a useful research diagnostic tool.
A page documenting the ICSI AppleTalk network.
My package of sound file utilities, dpwelib/sndutils, now has its own home page.

Click here to go to a page of things available only from within ICSI.

My web-based browser for speech files generates spectrograms on the fly using a Tcl version of Boutell's "gd" library. See it live in action here.

The home page for the 1997 IEEE Mohonk Workshop on Applications of Signal Processing to Acoustics and Audio. I'm on the committee and am responsible for the web pages.

There's a fun web-site introducing sinewave speech here at Haskins Labs. I put together some MATLAB code to allow you to play with the data they provide - you can download it here.

STP: I have a personal STP homepage, including my notes on how to verify the phoneme labels produced by the transcribers working on this project. There also will you find the latest version of the bestiary.

I've produced an HTML filter for stripping annoying ad banners from some common web index sites. Read about it here.

My demonstration of on-line soundfile submission and processing (adding reverb in this case) runs on MONTOYA.

The sound-effects CD-ROM server runs on MONTOYA, where the CD-ROM drive and SGI binaries are.

Some pages of 3rd party information that I have copied onto my server to be accessible even when the net is down (surely never!).

I have made a spreadsheet of all the disks I could find around the Realization group.

Here's my Change of address notification (print your own version from the postscript here, preferably on a modern laser printer like zp2).

Updated: $Date: 2000/09/19 19:35:50 $
DAn Ellis <dpwe@icsi.berkeley.edu>
International Computer Science Institute, Berkeley CA
(510) 666-2940 fax: (510) 666-2956
Version: 2.6