
The "Long Indel Model" introduces realistic indel length distributions to the TKF model.

It does so at the expense of expanding the finite-state transducer into a conditional generalized Pair HMM with arbitrary length distributions for indels. The resulting machine effectively has a number of states bounded by the product of the sequence lengths.

Finite-state transducer approximations to this model exist (Gotoh Transducer, Knudsen Miyamoto Transducer). A related approximation is the TKF92 model, which breaks a sequence into indivisible fragments. (The transducer approximations assume that the branch length is short enough to neglect overlapping indels, whereas the TKF92 model assumes that overlapping indels never occur.)
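To give a feel for how a small finite-state machine can imitate a long-tailed indel length distribution, here is a minimal Python sketch. It is not the Gotoh or Knudsen Miyamoto construction; it only assumes a hypothetical machine with a few parallel self-looping gap states, whose gap length distribution is then a mixture of geometric distributions. The weights and extension probabilities are made-up illustrative values.

```python
def geometric_pmf(length, extend):
    """P(gap length == length), length >= 1, for one self-looping gap state.

    `extend` is the probability of staying in the gap state for one more residue.
    """
    return (1.0 - extend) * extend ** (length - 1)


def mixture_pmf(length, weights, extends):
    """Gap length distribution of several parallel gap states.

    `weights[i]` is the probability of opening the gap in state i, and
    `extends[i]` is that state's self-loop probability. Summing over the
    states gives a mixture of geometric distributions.
    """
    return sum(w * geometric_pmf(length, e) for w, e in zip(weights, extends))


# Hypothetical parameters (assumptions for illustration, not fitted to data).
WEIGHTS = [0.70, 0.25, 0.05]
EXTENDS = [0.30, 0.80, 0.97]

# Mean gap length of the mixture, and a single geometric with the same mean.
mixture_mean = sum(w / (1.0 - e) for w, e in zip(WEIGHTS, EXTENDS))
single_extend = 1.0 - 1.0 / mixture_mean

if __name__ == "__main__":
    for n in (1, 5, 20, 50):
        print(n, mixture_pmf(n, WEIGHTS, EXTENDS), geometric_pmf(n, single_extend))
```

With three states the mixture already puts far more probability on long gaps than a single geometric state with the same mean length, at the cost of only two extra states rather than a state space that grows with the sequence lengths.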

In fact, it is known that any "concave" gap penalty (i.e. a monotonically decreasing gap length distribution) can be well approximated by a finite-state machine, at least for Viterbi alignment:

-- IanHolmes - 23 Apr 2008

  • cited by at least one person as their "favourite paper ever"