Date: Wed, 8 May 2002 09:29:07 -0700 (PDT)
From: Renate and Thilo Weller and Pfau
Subject: Re: presegmentation
To: Panu Somervuo
MIME-Version: 1.0
X-Keywords:

Hi Panu,

it has been a long time since I gave something directly to Jane, since
the last meetings were transcribed by 'tigerfish'. In that case you have
to convert the trs-file (the one without the trs-ending) into the
linearized version! In order to do that you first have to run trs2list
and then some of Adam's tools (makewaveseg ..., best ask him about that)
to convert the multichannel output into the linearized wavefile, which
will then be sent to tigerfish for transcription.

When transcription is done in-house, first do a manual check by loading
the trs file (*-AfterCorrelAndPzmCorrel) into the multichannel
transcriber to see if the quality is reasonable. If not, some parameters
can be tuned:
- speech and nonspeech priors in the cfg-files
- correlation thresholds for the last postprocessing step,
or check if the quality before correl and pzmcorrel seems to be better!

Hope this helps
Thilo

--- Panu Somervuo wrote:
> Hi Thilo,
>
> I ran your segmenter on the btr001 and btr002 data.
> Could you tell me what you usually gave to Jane or the transcribers
> (there is more than one trs file in the output directory). And did you
> do some manual corrections or parameter tunings afterwards (or manual
> checking/correction of automatic segmentation).
> All information is welcome :)
>
> Panu
>

===========================================================

Date: Tue, 4 Jun 2002 02:12:31 -0700 (PDT)
From: Renate and Thilo Weller and Pfau
Subject: Re: ICSI and SRI data
To: Panu Somervuo
MIME-Version: 1.0

Hi Panu,

sorry again for the late response. Have been quite busy around here.

--- Panu Somervuo wrote:
> Hi Thilo,
>
> ookayy, now the question things start again.
> But first: your programs have been running smoothly
> so nothing to complain about them.
>

Good to hear!
;-)

> Now we have data from SRI and I was told there are no time skews in
> the channels of their recordings. I ran the segmenter with and without
> skew compensation and surprisingly got the same segmentations. Do you
> remember whether you benefited much from the skew compensation? The
> quality seems to be ok, although there are some inserted and some
> deleted speech segments every now and then, but probably there were
> those also for the ICSI segmentations. Do you have some suggestions
> what to do if using the data from another site besides running the
> segmenter as it is?
>

What exactly did you change to run the segmenter without skew??? I am
asking, since several tools have to be set to run without time skew
correction (at least the segmenter itself and the feature extractor if I
remember correctly). I am not sure if there is an option for the feature
extractor to run it without time skew correction or if there was a
separate version, just ignoring time skews in the acoustic
preprocessing. If I remember correctly there is a function 'start' in
the class CAkupreSimple which can be given an optional argument which
defines the skew. If you do not want to correct for the skew you just
have to set this argument to zero.

> Another thing is trying to develop the segmenter. I have read your
> ASRU2001 paper. How did you end up with the current approach?

The problem on the one hand was that I did not like the results with
the Gaussian system too much (too many insertions, mostly due to
crosstalk). On the other hand I did not have a good tool for training
(or clustering) the Gaussians in the case of mixture models, and the ANN
tools are quite simple to configure.

> Hmm, maybe this is too broad a question, but I ask this because I
> would not like to repeat all the stuff you have already done. On the
> other hand, it can be difficult to just continue from the level where
> the segmenter is now without knowing its history. Well, let's see.
>

Let me know, if you have more questions!!!

> Best,
> Panu
>

Hear from you soon.
Thilo

===========================================================

Date: Wed, 5 Jun 2002 06:48:15 -0700 (PDT)
From: Renate and Thilo Weller and Pfau
Subject: Re: ICSI and SRI data
To: Panu Somervuo
MIME-Version: 1.0

Hi Panu,

--- Panu Somervuo wrote:
> >Hi Panu,
> >
> >sorry again for the late response. Have been quite
> >busy around here.
> >
>
> Hi,
>
> I appreciate if you have time to answer. But of course I'm aware that
> you have your own things to do and really it's not your job to be my
> teacher all the time. Anyway, thanks for replying.
>

Don't worry, it really is no hassle at all, and I appreciate your
efforts to get the segmentation running again!

> >What exactly did you change to run the segmenter
> >without skew??? I am asking, since several tools have
> >to be set to run without time skew correction (at
> >least the segmenter itself and the feature extractor
> >if I remember correctly). I am not sure if there is an
> >option for the feature extractor to run it without
> >time skew correction or if there was a separate
> >version, just ignoring time skews in the acoustic
> >preprocessing. If I remember correctly there is a
> >function 'start' in the class CAkupreSimple which can
> >be given an optional argument which defines the skew.
> >If you do not want to correct for the skew you just
> >have to set this argument to zero.
> >
>
> 1) For CreateFeatAndLab in the .cfg file I set
>    Tools.CorrectForSkew = No
>    I also compiled the program again and checked that this flag has an
>    effect. (For some reason I felt that without recompilation that flag
>    had no effect, at least the program outputted the skews which were
>    not zero.)
>
> 2) For correlations, I used calcmaxcorreltopzm-noskew
>    ...then qnsfwd to get apost-files
>
> 3) For sns-detector (spdtest...), I explicitly set the ByteSkips of
>    all channels to 0 and recompiled the program.
>
> What I compared was the results of the sns-detector with and without
> skew compensation. Since steps 1) and 2) were the same for the two
> experiments, this means that the apost files were the same for both
> runs.

Oh, I understand. You should have created new aposterioris and used
those!

> But since the sns-detector will add the skews (if not explicitly set
> to zero) I thought there would be some difference between the
> performances.
>

The only step for which the segmenter needs to know whether there are
skews or not is the (optional) thresholding step, which uses not only
the correlations between close talking and pzm channels but also the
correlations between different close talking channels.

> Maybe I should confirm the very fundamental thing:
> So sns-detector reads the pfiles (feature files), not anymore the
> waveform files?

It does read the waveform, just to calculate correlations!!!! The hybrid
system only uses the aposteriors which are stored in the pfiles!!!

> So even if you can define the ByteSkips, does it have any effect
> anymore? Or does the sns-detector do the entire feature computation
> all over again (or just the normalization for the existing pfiles I
> thought). I guess the latter, but I just want to confirm. I probably
> should read the code line by line to really know what's going on.
>
> Also very fundamental question:
> After the qnsfwd recognition stage, is the role of the HMM-based
> sns-detector more to eliminate the false alarms than to add new
> segments? It apparently also finetunes the existing segment borders.

It does so by using the priors, which are defined in the cfg files.

> There are also some parameters for breath-detection, is this more like
> finetuning or really essential part of the segmentation?

This is just finetuning and I am not sure if it has any effect in the
current version any more, sorry.
> I thought it could be possible to train a separate breath-model (but
> most likely, what you have done, does effectively the same).
>

What I did is just to find temporal patterns which look like
consecutive breaths. I guess what would be better is to use additional
features to characterize breaths better?!

> >>
> >Let me know, if you have more questions!!!
>
> I feel guilty already now asking too many questions :)
> But do you know, are you going to write some kind of tech. report of
> all your stuff in some near or far future?
>

I should and I will!

> Despite my mails, enjoy the summer!
>
> Panu
>

Thilo

===========================================================

Date: Wed, 5 Jun 2002 06:51:09 -0700 (PDT)
From: Renate and Thilo Weller and Pfau
Subject: Re: ICSI and SRI data
To: Panu Somervuo
MIME-Version: 1.0

--- Panu Somervuo wrote:
>
> I wrote:
> > Maybe I should confirm the very fundamental thing:
> > So sns-detector reads the pfiles (feature files), not anymore the
> > waveform files? So even if you can define the ByteSkips, does it
> > have any effect anymore? Or does the sns-detector do the entire
> > feature computation all over again (or just the normalization for
> > the existing pfiles I thought). I guess
>
> Ok, reading from the code it looks that it computes the features all
> over again...
>

It does, but it only uses the correlations if I remember correctly. In
principle, all the feature normalization stuff and related things can be
thrown out and you can use the features of the pfiles, which are already
normalized. In fact this would save time, since the feature extraction
can be quite lengthy if done remotely (not on the machine on which the
wav files are stored).

> -Panu
>

Thilo

===========================================================
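
Note on the skew and correlation discussion in this thread: the
following is a minimal Python sketch, not the segmenter's actual code.
It only illustrates why a channel skew matters for cross-channel
correlations and why a skew of zero leaves a channel untouched. All
names (apply_skew, normalized_correlation, peak_correlation, TRUE_SKEW)
are hypothetical, the skew is counted in samples rather than in bytes as
with the ByteSkips setting, and the test signal is synthetic.

```python
import math


def apply_skew(samples, skew):
    """Drop the first `skew` samples; skew=0 returns the channel unchanged."""
    if skew < 0:
        raise ValueError("skew must be non-negative")
    return samples[skew:]


def normalized_correlation(a, b, lag):
    """Normalized correlation between a[i] and b[i + lag] over their overlap."""
    if lag >= 0:
        x, y = a, b[lag:]
    else:
        x, y = a[-lag:], b
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    energy = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if energy == 0.0:
        return 0.0  # empty overlap or silent channel
    return sum(p * q for p, q in zip(x, y)) / energy


def peak_correlation(a, b, max_lag):
    """Return (lag, value) of the highest correlation for lags in [-max_lag, max_lag]."""
    best_lag, best_value = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        value = normalized_correlation(a, b, lag)
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, best_value


# Synthetic example: the second channel is the first one delayed by TRUE_SKEW samples.
TRUE_SKEW = 3
reference = [math.sin(0.3 * i) * math.exp(-0.01 * i) for i in range(200)]
delayed = [0.0] * TRUE_SKEW + reference[:-TRUE_SKEW]

lag_before, value_before = peak_correlation(reference, delayed, 8)
lag_after, value_after = peak_correlation(reference, apply_skew(delayed, TRUE_SKEW), 8)
print("peak lag without compensation:", lag_before)
print("peak lag with compensation:", lag_after)
```

In this toy setup the correlation peak sits at the skew (lag 3) before
compensation and at lag 0 afterwards. For recordings without skews,
such as the SRI data mentioned above, passing a skew of zero changes
nothing, which fits the advice to set the skew argument to zero.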