ICSI Speech FAQ:
3.1 I found this file. What is it?

Answer by: dpwe - 2000-07-27

Below are a few ideas for deducing the content of a random file. When all else fails, take a look at the bytes in the file, e.g.

  hexdump <file> | head

... although in general it takes a certain amount of experience before this output is much use.
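As a rough sketch of the same idea using od, which is part of POSIX and so should be present wherever hexdump is not, the helper below prints the start of a file twice: once as hex bytes and once as characters. The name peek and the default of 64 bytes are arbitrary choices for illustration, not anything from the speech tools.

```shell
# peek FILE [N]: show the first N bytes of FILE (default 64).
# "peek" is a hypothetical helper name, not an existing command.
peek() {
    # -A x : print offsets in hex
    # -t x1: print each byte as a two-digit hex value
    # -N n : stop after n bytes
    od -A x -t x1 -N "${2:-64}" "$1"
    # -c: the same bytes as printable characters or backslash escapes
    od -A x -c -N "${2:-64}" "$1"
}
```

Calling peek myfile 16 shows just the first 16 bytes, which is often where a format's identifying signature sits, if it has one.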

Filename: If you're lucky, the creator of the file will have observed file naming conventions (and used them truthfully); at the very least, the name should be a clue. See the section on file naming conventions in David Johnson's drspeech.txt for an extensive list of preferred extensions within the speech group.

The "file" command: Unix provides a command called file which inspects the first few bytes of a file's content and attempts to classify it based on some rules. Unfortunately it doesn't know about the more special-purpose file types associated with speech recognition, but it's good for distinguishing executables, scripts, etc. It will also tell you which architecture (SPARC, i386) a binary is compiled for.
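One way to use this in practice, sketched under the assumption that your version of file reports an unrecognized binary simply as "data" (the usual behaviour), is to sweep a directory and list only the files it could not classify. Those are the ones worth trying with the more specialized tools. The function name unknown_files is made up for this example.

```shell
# unknown_files DIR: list regular files in DIR that file(1) describes
# only as "data", i.e. files it could not classify.
# "unknown_files" is a hypothetical helper name.
unknown_files() {
    for f in "$1"/*; do
        # Skip directories and anything else that is not a regular file.
        [ -f "$f" ] || continue
        # -b: brief output, without the leading file name
        if [ "$(file -b "$f")" = "data" ]; then
            echo "$f"
        fi
    done
}
```

For example, unknown_files . prints the unclassified files in the current directory, one per line.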

Sound files: If you think your file might be a sound waveform, try

  sndcat -q -v <file>

sndcat will attempt to identify the header of the common soundfile formats. If it reports that the file type is PCM (in parentheses after the file name), it failed to identify the format and is treating the file as raw binary. However, if you think the file might be in ESPS/xwaves format, try

  sndcat -q -v -S ESPS <file>

i.e. explicitly force the file to be treated as that type, since these headers cannot easily be spotted. If the file is not ESPS, this will result in an error message.

Feature/probability files: If you think your file might be a feature file, you can try

  feacat -q -ipf <type> <file>

where <type> is one of the codes known by feacat, such as pfile, lna or rapbin. If the type matches the file, feacat will read through the file and report the total number of features, frames and utterances; otherwise it will report an error. For headerless Cambridge formats such as lna and pre, you also have to tell feacat what you think the feature vector size is, using the -width nn option. For lna files, there is a script, lnainfo, that attempts to guess the feature width by searching over a range of values.

Labels files: The command

  labcat -q -ipf <type> <file>

works in much the same manner as feacat above, except that it reads labels files. <type> can be pfile, ilab, ascii etc.
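Since feacat and labcat are both driven the same way, trying several candidate types can be wrapped in a small loop. The following is only a sketch: try_types is a made-up helper, it uses just the -q and -ipf options quoted above, and it makes no assumption about exit codes, so you still have to read the output to see which type was accepted. Headerless types such as lna would additionally need the -width option, which this loop does not pass.

```shell
# try_types CMD FILE TYPE...: run "CMD -q -ipf TYPE FILE" once per TYPE.
# CMD is expected to be feacat or labcat.
# "try_types" is a hypothetical helper name, not part of the speech tools.
try_types() {
    cmd=$1
    target=$2
    shift 2
    for type in "$@"; do
        echo "== $cmd -q -ipf $type $target"
        # Merge stderr into stdout so any error message appears
        # directly under the type that caused it.
        "$cmd" -q -ipf "$type" "$target" 2>&1
    done
}
```

Typical calls would be try_types feacat myfile pfile rapbin or try_types labcat myfile pfile ilab ascii, using the type codes listed above.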

Grammar (language model) files: Grammars for recognition come in a variety of weird binary formats with various restrictions on which programs (compiled with which options) can use them. I don't know the full answer to this one, but the noway grammar formats indicate their type with their first three bytes e.g. "NG3" for an ngram-format file containing trigrams.
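The three-byte check is easy to do by hand. The sketch below prints the first three bytes of a file and tests for the one code mentioned above, "NG3"; other noway type codes are not listed here, so none are assumed. Both function names are invented for this example, and head -c, though not strictly POSIX, is available in the GNU and BSD versions of head.

```shell
# magic3 FILE: print the first three bytes of FILE.
# NUL bytes are dropped so the result is safe to hold in a shell string.
# "magic3" is a hypothetical helper name.
magic3() {
    head -c 3 "$1" | tr -d '\000'
}

# is_noway_trigram FILE: succeed if FILE begins with the bytes "NG3".
# "is_noway_trigram" is a hypothetical helper name.
is_noway_trigram() {
    [ "$(magic3 "$1")" = "NG3" ]
}
```

For example, is_noway_trigram mygrammar && echo "looks like a noway trigram file" prints the message only when the signature matches.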

Generated by build-faq-index on Tue Mar 24 16:18:14 PDT 2009