ICSI Speech FAQ:
3.1 I found this file. What is it?
Answer by: dpwe - 2000-07-27
Below are a few ideas for deducing the content of a random file.
When all else fails, take a look at the bytes in the file, e.g.
hexdump <file> | head
although it generally takes a certain amount of experience
before this information is much use.
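As a rough sketch of the byte inspection suggested above, the shell function below dumps the first few bytes of a file in hex. The name peek_bytes and the default of 32 bytes are illustrative choices, not anything provided by the speech group's tools, and od is used only as a widely available stand-in for hexdump.

```shell
# peek_bytes FILE [N]: print the first N bytes (default 32) of FILE in hex.
# peek_bytes is a hypothetical helper name; od stands in for hexdump here.
peek_bytes() {
    head -c "${2:-32}" -- "$1" | od -A x -t x1
}

# Example: peek_bytes mystery.dat 16
```

Text headers and short identifying tags at the start of a file tend to show up in the first line or two of this output.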
- Filename:
If you're lucky, the creator of the file will have observed
file naming conventions (and used them truthfully); at the
very least, the name should be a clue. See the
section on file naming conventions in
David Johnson's drspeech.txt for an extensive list of
preferred extensions within the speech group.
- The "file" command:
Unix provides a command called
file which inspects the first few bytes of a file's content
and attempts to classify it based on some rules. Unfortunately
it doesn't know about the more special-purpose file types associated
with speech recognition, but it's good for distinguishing
executables, scripts, etc. It will also tell you which architecture
(SPARC, i386) a binary is compiled for.
- Sound files:
If you think your file might be a sound waveform, try
sndcat -q -v <file>
sndcat will attempt to identify the header of the common
soundfile formats. If it reports that the file type is PCM
(in parens after the file name), it means that it failed to
identify the header and is treating the file as raw binary. However,
if you think the file might be in ESPS/xwaves format, try
sndcat -q -v -S ESPS <file>
i.e. explicitly force the file to be treated as that type, since
these headers cannot be easily spotted. If the file is not ESPS,
this will result in an error message.
- Feature/probability files:
If you think your file might be a feature file, you can try
feacat -q -ipf <type> <file>
where <type> is one of the codes known by feacat
such as pfile, lna, rapbin. If the type matches the file, feacat
will read through the file and report the total number of features,
frames and utterances; otherwise it will report an error. For
headerless Cambridge formats such as lna and pre, you also have to
tell feacat what you think the feature vector size is, using the
-width nn option.
For lna files, there is a script
lnainfo that attempts to guess the feature width by searching over
a range of values.
- Labels files:
labcat -q -ipf <type> <file>
works in much the same manner as feacat above, but for
labels files. <type> can be pfile, ilab, ascii, etc.
- Grammar (language model) files:
Grammars for recognition come in a variety of weird binary formats
with various restrictions on which programs (compiled with which
options) can use them. I don't know the full answer to this one,
but the noway grammar formats indicate their type with their first
three bytes, e.g. "NG3" for an ngram-format file containing trigrams.
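The three-byte check described for noway grammar files can be sketched in shell as follows. The function names magic3 and check_noway_tag are hypothetical, and "NG3" is the only tag this sketch recognizes because it is the only one named here; any other tag is reported as unrecognized rather than guessed at.

```shell
# magic3 FILE: print the first three bytes of FILE (NUL bytes dropped).
# magic3 and check_noway_tag are hypothetical helper names.
magic3() {
    head -c 3 -- "$1" | tr -d '\000'
}

# check_noway_tag FILE: report whether FILE starts with the "NG3" tag.
# Only "NG3" is taken from the FAQ text; other tags are not interpreted.
check_noway_tag() {
    case "$(magic3 "$1")" in
        NG3) echo "$1: NG3 (ngram-format file containing trigrams)" ;;
        *)   echo "$1: unrecognized tag" ;;
    esac
}

# Example: check_noway_tag some.grammar
```

An "unrecognized tag" result does not rule out a grammar file; it only means the file does not start with the one tag this sketch knows about.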
Generated by build-faq-index on Tue Mar 24 16:18:14 PDT 2009