Tags:
create new tag
, view all tags
DartSrc:util/score.h

> When working in some space which is X*log(Probability)
> (x I guess 20 or 50) which is common one can get overflow
> or underflow issues...

X is 1000 for HMMER, SAM, DART & possibly others.

>     Underflow => v. small probabilities, just truncate
> them
>
>     Overflow => keep an additional integers on the
> matrix which you have subtracted off all numbers
> when one number hits the limit.

Sounds right ballpark. I ran into another kind of overflow problem with
Handel when the scores get very big and negative for large trees (too big
to hold in 31 bits). Haven't heard of that overflow technique but it could
work. The technique of adding a constant score per column is the one e.g.
Anders was using.

> For log-sum functions, one tabulates the
> results of P_1 + P_2 (can save half the
> space cause you can choose the smaller one)
> and then linear interpolate between the results
> when it is not bang on.

you want to evaluate S = log(P_1+P_2) = log(exp(S_1) + exp(S_2))

  where S_1 and S_2 are log likelihood scores

so... S = log(exp(S_1) + exp(S_2))
        = log(exp(S_1) * (1 + exp(S_2 - S_1)))
        = S_1 + log (1 + exp(S_2 - S_1))
        = S_1 + f(S_2-S_1)

where f(x) = log(1+exp(x)). You choose things so that x is negative. If x
is very hugely negative (e.g. -10000000) then f(x) ~= 0, so you only need
to tabulate up to some maximum (negative) x.

IanHolmes discusses LogArithmetic with EwanBirney, 7 March 2005

Topic revision: r1 - 2005-03-08 - TWikiGuest
 

This site is powered by the TWiki collaboration platformCopyright © 2008-2014 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback
TWiki Appliance - Powered by TurnKey Linux