Help! Baum-Welch/EM for HMM


Vimal
02-19-2004, 10:51 AM
Hello,

Referring to training an HMM using Baum-Welch: I tried implementing it.
The forward and backward probabilities both tend to zero after moving
through the chain for about 10 steps, i.e. I get ever-smaller probabilities
and they become zero after a while. I have referred to other papers too,
but they do not mention anything about underflow for the forward and
backward probabilities.

I am confused: is it normal for the forward (and backward) probabilities
to shrink as we move from each 't' to 't+1', and is there any way we can
mitigate this?

Best regards,
Vimal

Antti-Veikko Rosti
02-19-2004, 11:07 AM
Vimal wrote:
> I am confused: is it normal for the forward (and backward) probabilities
> to shrink as we move from each 't' to 't+1', and is there any way we can
> mitigate this?

Yes, it is "normal operation". If you recursively multiply probabilities
(0<=p<=1), the result tends to zero. The common solution is to use log
probabilities and arithmetics.
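
The effect is easy to see numerically. Something along these lines (just
a Python/NumPy sketch for illustration, with made-up numbers):

import numpy as np

p = np.full(400, 0.1)           # 400 transition/emission probabilities in [0, 1]

direct = np.prod(p)             # 0.1^400 = 1e-400: underflows to 0.0
log_domain = np.sum(np.log(p))  # stays finite: 400 * ln(0.1) ~ -921.0

print(direct, log_domain)       # 0.0  -921.034...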

--
A-V.I. Rosti, PhD Student
Machine Intelligence Laboratory
Cambridge University Engineering Department

Vimal
02-19-2004, 08:06 PM
Thanks Antti-Veikko,

Is there an implementation or a document available for this conversion?

I am trying to figure out the technique used in HTK. Here I have an HMM
with b(o(t)) being Gaussian, which makes alpha and beta go to zero.

Thanks,

Cheers,
Vimal

Antti-Veikko Rosti <[email protected]> wrote in message news:<[email protected]>...
> Vimal wrote:
> > I am confused: is it normal for the forward (and backward) probabilities
> > to shrink as we move from each 't' to 't+1', and is there any way we can
> > mitigate this?
>
> Yes, it is "normal operation". If you recursively multiply probabilities
> (0 <= p <= 1), the result tends to zero. The common solution is to use
> log probabilities and log arithmetic.

John Openshaw
02-20-2004, 08:41 AM
In article <[email protected]>, Vimal
<[email protected]> writes
>Hello,
>
>Referring to training an HMM using Baum-Welch: I tried implementing it.
>The forward and backward probabilities both tend to zero after moving
>through the chain for about 10 steps, i.e. I get ever-smaller probabilities
>and they become zero after a while. I have referred to other papers too,
>but they do not mention anything about underflow for the forward and
>backward probabilities.
>
>I am confused: is it normal for the forward (and backward) probabilities
>to shrink as we move from each 't' to 't+1', and is there any way we can
>mitigate this?

Nothing is wrong - this does happen (you are repeatedly multiplying numbers
less than one); use log arithmetic. Underflow can be a particular problem
when starting training - how well did you seed the means/variances of the HMM?


--
John Openshaw

Antti-Veikko Rosti
02-20-2004, 10:24 AM
Vimal wrote:
> Is there an implementation or a document available for this conversion?

Why don't you have a look at the HTK source code? You only need a valid
e-mail address to register at:
http://htk.eng.cam.ac.uk/

> I am trying to figure out the technique used in HTK. Here I have an HMM
> with b(o(t)) being Gaussian, which makes alpha and beta go to zero.

For a single-mixture Gaussian, computing log(b(o(t))) is trivial. You
can pre-compute the constant term Z = \log(|S_j|(2\pi)^n), where |S_j| is
the determinant of the covariance matrix associated with state j, and n
is the dimensionality of the observation vector o(t). The log-likelihood
is then \log(N(o(t);m_j,S_j)) = -1/2(Z + (o(t)-m_j)^T S_j^{-1} (o(t)-m_j)),
where m_j is the mean associated with state j and ()^T denotes
matrix/vector transpose.
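
Something along these lines, if it helps (a Python/NumPy sketch, purely
illustrative - this is not HTK code and the function name is made up):

import numpy as np

def make_log_gaussian(m, S):
    """Return a function computing log N(o; m, S) for one state."""
    n = len(m)
    S_inv = np.linalg.inv(S)
    # Z = log(|S| (2 pi)^n), computed once per state
    Z = np.linalg.slogdet(S)[1] + n * np.log(2.0 * np.pi)

    def log_b(o):
        d = o - m
        return -0.5 * (Z + d @ S_inv @ d)

    return log_b

# usage: log_b = make_log_gaussian(mean_j, cov_j); then call log_b(o_t)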

For a Gaussian mixture model you need to use a log-add:
\log(\exp(A)+\exp(B)) = A + \log(1+\exp(B-A)), assuming A >= B. The second
term is worth computing only if B-A exceeds some (negative) accuracy
threshold; below that, \exp(B-A) underflows and the correction is negligible.
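
A small helper along those lines might look like this (illustrative
Python; the threshold value here is my own choice, not taken from HTK):

import numpy as np

LOG_ADD_FLOOR = -23.0   # exp(-23) ~ 1e-10: below this the correction is negligible

def log_add(A, B):
    """Return log(exp(A) + exp(B)) without leaving the log domain."""
    if A < B:
        A, B = B, A                       # ensure A >= B
    if B == -np.inf or B - A < LOG_ADD_FLOOR:
        return A                          # second term too small to matter
    return A + np.log1p(np.exp(B - A))    # A + log(1 + exp(B - A))

# for a mixture, accumulate the weighted component log-likelihoods with log_add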

Modifying the rest of the forward-backward algorithm to use log
arithmetic should be trivial now.
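
For example, the forward recursion in the log domain could look roughly
like this (a sketch only, not the HTK source; NumPy's logaddexp plays the
role of the log-add above, and pi_j, a_ij, b_j(o_t) are the usual HMM
quantities):

import numpy as np

def log_forward(log_pi, log_A, log_b):
    """log_pi[j] = log pi_j, log_A[i, j] = log a_ij, log_b[j, t] = log b_j(o_t)."""
    N, T = log_b.shape
    log_alpha = np.empty((T, N))
    log_alpha[0] = log_pi + log_b[:, 0]                  # initialisation, t = 0
    for t in range(1, T):
        for j in range(N):
            # log alpha_t(j) = log sum_i exp(log alpha_{t-1}(i) + log a_ij) + log b_j(o_t)
            log_alpha[t, j] = (np.logaddexp.reduce(log_alpha[t - 1] + log_A[:, j])
                               + log_b[j, t])
    return log_alpha    # total log-likelihood: np.logaddexp.reduce(log_alpha[-1])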

--
A-V.I. Rosti, PhD Student
Machine Intelligence Laboratory
Cambridge University Engineering Department

John Openshaw
02-24-2004, 10:39 AM
In article <[email protected]>, Antti-Veikko Rosti
<[email protected]> writes
>Vimal wrote:
>> Is there an implementation or a document available for this conversion?
>
>Why don't you have a look at the HTK source code? You only need a valid
>e-mail address to register at:
>http://htk.eng.cam.ac.uk/

That's not true - you are also required to comply with the terms of the
licensing agreement. When you get older you may appreciate that it can
be the subtle things that are important.



--
John Openshaw

Vimal
02-25-2004, 07:47 PM
Thanks for all the comments.
In fact, I have already complied with the HTK registration requirements
(the 'I agree' step too). I started with quite generous mean and variance
values (almost the same as I would expect at the end).

By the way, to skip the logs I am using the normalization approach:
basically normalizing alpha(t) at each 't' and reusing the same
normalization constants for the backward pass, i.e. for beta(t).
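
Roughly, what that looks like (a Python/NumPy-style sketch, not my actual
code; pi, A and b are the usual initial, transition and output
probabilities, with b[j, t] = b_j(o_t)):

import numpy as np

def scaled_forward_backward(pi, A, b):
    """Return scaled alpha, beta and the total log-likelihood."""
    N, T = b.shape
    alpha = np.empty((T, N))
    beta = np.empty((T, N))
    c = np.empty(T)                         # c_t = 1 / sum_j alpha_t(j)

    alpha[0] = pi * b[:, 0]
    c[0] = 1.0 / alpha[0].sum()
    alpha[0] *= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * b[:, t]
        c[t] = 1.0 / alpha[t].sum()
        alpha[t] *= c[t]                    # keep each alpha_t summing to one

    beta[T - 1] = c[T - 1]                  # reuse the same constants for beta
    for t in range(T - 2, -1, -1):
        beta[t] = c[t] * (A @ (b[:, t + 1] * beta[t + 1]))

    log_likelihood = -np.sum(np.log(c))     # log P(O) comes out of the scale factors
    return alpha, beta, log_likelihood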

It works well for my application :-)

Cheers,
Vimal

John Openshaw <[email protected]> wrote in message news:<[email protected]>...
> In article <[email protected]>, Vimal
> <[email protected]> writes
> >Hello,
> >
> >Referring to training an HMM using Baum-Welch: I tried implementing it.
> >The forward and backward probabilities both tend to zero after moving
> >through the chain for about 10 steps, i.e. I get ever-smaller probabilities
> >and they become zero after a while. I have referred to other papers too,
> >but they do not mention anything about underflow for the forward and
> >backward probabilities.
> >
> >I am confused: is it normal for the forward (and backward) probabilities
> >to shrink as we move from each 't' to 't+1', and is there any way we can
> >mitigate this?
>
> Nothing is wrong - this does happen (you are repeatedly multiplying numbers
> less than one); use log arithmetic. Underflow can be a particular problem
> when starting training - how well did you seed the means/variances of the HMM?