<p><b>Gotoh Pair HMM</b> (imported from TWiki by Ian Holmes, 2008-01-21; page renamed from GotohPairHMM to Gotoh Pair HMM on 2017-01-02)</p>
<div>Serial composition of the [[Singlet Transducer]] and the [[Gotoh Transducer]]:<br />
<br />
<graphviz><br />
digraph G {<br />
<br />
SS [shape=doublecircle, color=red];<br />
EE [shape=doublecircle, color=red];<br />
<br />
SI [shape=house, label="SI / II", color=red];<br />
IM [shape=rect, color=red];<br />
ID [shape=invhouse, color=red];<br />
<br />
SS->SI [label=a];<br />
SS->IM [label=gb];<br />
SS->ID [label=gc];<br />
SS->EE [label="h(b+c)"];<br />
<br />
SI->SI [label=v];<br />
SI->IM [label=gw];<br />
SI->ID [label=gx];<br />
SI->EE [label="h(w+x)"];<br />
<br />
IM->SI [label=d];<br />
IM->IM [label=ge];<br />
IM->ID [label=gf];<br />
IM->EE [label="h(e+f)"];<br />
<br />
ID->SI [label=p];<br />
ID->IM [label=gq];<br />
ID->ID [label=gr];<br />
ID->EE [label="h(q+r)"];<br />
<br />
label="Pair HMM version of Gotoh machine";<br />
}<br />
</graphviz><br />
<br />
Notationally we can write this composition as <math>\stackrel{\infty}{\rightarrow} a \stackrel{\Delta T}{\rightarrow} b</math><br />
<br />
Notes:<br />
# SI and II are identical states;
# WW and WX are identical null states preceding EE, which are trivially eliminated.
<br />
If the S and M states in the [[Gotoh Transducer]] have identical outgoing transition weights,<br />
then the same is true of SS and IM in the above Pair HMM.<br />
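<br />
As a quick sanity check on the diagram above, the following Python sketch tabulates the Pair HMM transition weights and confirms that each state's outgoing weights sum to one. It assumes h = 1 - g, which is what makes each row normalize; the helper name <code>pair_hmm_rows</code> and all numerical values are arbitrary illustrative choices, not part of the model description.<br />

```python
def pair_hmm_rows(g, a, b, c, d, e, f, p, q, r, v, w, x):
    """Outgoing transition weights of the Gotoh Pair HMM, keyed by source state."""
    h = 1.0 - g  # assumption: h is the complement of g

    def row(to_ii, to_im, to_id):
        # Pattern shared by every state in the diagram:
        # first weight as-is, next two scaled by g, and h times their sum to EE.
        return {"II": to_ii, "IM": g * to_im, "ID": g * to_id,
                "EE": h * (to_im + to_id)}

    return {
        "SS": row(a, b, c),
        "II": row(v, w, x),   # SI and II are the same state
        "IM": row(d, e, f),
        "ID": row(p, q, r),
    }

# Arbitrary example values with S/M symmetry (a = d, b = e, c = f).
rows = pair_hmm_rows(g=0.9, a=0.1, b=0.8, c=0.1, d=0.1, e=0.8, f=0.1,
                     p=0.05, q=0.65, r=0.3, v=0.3, w=0.6, x=0.1)
row_sums = {state: sum(out.values()) for state, out in rows.items()}
```

With S/M symmetry imposed, <code>rows["SS"]</code> and <code>rows["IM"]</code> come out identical, which is the statement made about SS and IM above.<br />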
<br />
There are several options for I/D symmetry. For example:<br />
# Separable, identical distributions of gap lengths:<br />
** p = 0 (no ID->II, keeps distributions separable);
** v = g * r (II->II and ID->ID);<br />
** d = f/(e + f) (IM->II and IM->ID);<br />
** d = x/(w + x) (IM->II and II->ID).<br />
# Perfect exchangeability between I and D states:<br />
** d = g * f (IM->II and IM->ID);<br />
** v = g * r (II->II and ID->ID);<br />
** p = g * x (ID->II and II->ID).<br />
# Exchangeability between I and D states when the order of I's and D's is summed out:
** d * w = g * f * q (IM->II->IM and IM->ID->IM);<br />
** v = g * r (II->II and ID->ID).<br />
<br />
Note that these three options carry successively weaker assumptions about the order of insertions & deletions.<br />
In all cases, the joint distribution over the total number of I's and D's is symmetric,<br />
so that e.g. P(IDD)+P(DID)+P(DDI)=P(DII)+P(IDI)+P(IID).<br />
Option #2 implies that individual terms in this equation will cancel, e.g. P(IDD)=P(DII) and P(IDI)=P(DID).<br />
Option #1 implies that gaps can only appear in one rigid order (I's before D's), so that P(IDD)=P(IID) and all other terms are zero.<br />
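<br />
The symmetry of the summed probabilities can be checked numerically for option #3. The sketch below is only an illustration under stated assumptions: it reads P(IDD) as the weight of the path IM->II->ID->ID->IM (a gap string flanked by matches), and it picks arbitrary parameter values that satisfy v = g * r, d * w = g * f * q and the normalization constraints. The helper <code>gap_path_weight</code> is a hypothetical name.<br />

```python
from math import prod

def gap_path_weight(gaps, t):
    """Weight of the path IM -> gaps -> IM, e.g. gaps = "IDD"."""
    states = "M" + gaps + "M"
    return prod(t[(src, dst)] for src, dst in zip(states, states[1:]))

# Arbitrary values chosen to satisfy option 3 (v = g*r and d*w = g*f*q).
g, r, x, p, f = 0.9, 0.3, 0.2, 0.1, 0.1
v = g * r
q = 1.0 - p - r          # normalization of the ID row
w = 1.0 - v - x          # normalization of the II row
d = g * f * q / w        # option 3 constraint d*w = g*f*q
e = 1.0 - d - f          # normalization of the IM row

# Transition weights read off the Pair HMM diagram (M = IM, I = II, D = ID).
t = {("M", "I"): d, ("M", "M"): g * e, ("M", "D"): g * f,
     ("I", "I"): v, ("I", "M"): g * w, ("I", "D"): g * x,
     ("D", "I"): p, ("D", "M"): g * q, ("D", "D"): g * r}

one_i_two_d = sum(gap_path_weight(s, t) for s in ("IDD", "DID", "DDI"))
two_i_one_d = sum(gap_path_weight(s, t) for s in ("DII", "IDI", "IID"))
```

For these values the two sums agree up to floating-point rounding.<br />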
<br />
In addition to the above constraints, there are the constraints inherent to the [[Gotoh Transducer]]:<br />
* Probabilistic normalization:<br />
** a + b + c = 1,<br />
** d + e + f = 1,<br />
** p + q + r = 1,<br />
** v + w + x = 1.<br />
* S/M symmetry:<br />
** a = d;<br />
** b = e;<br />
** c = f.<br />
<br />
This makes 14 parameters and either 11 constraints (for separable gap lengths), 10 constraints (for perfect I/D exchangeability) or 9 constraints (for order-independent exchangeability).<br />
<br />
This leaves 3 free parameters for separable gaps (for example g, d & v), 4 parameters for perfect I/D exchangeability (for example g, d, v & p) and 5 parameters for order-independent exchangeability (for example g, d, v, p & x).<br />
<br />
The I/D symmetry constraints amount to a form of detailed balance.<br />
The condition of being initially at equilibrium imposes a further constraint.<br />
This can be seen e.g. by eliminating the X-tape (i.e. the ID state) and comparing the resultant marginalized Y-emitter with the [[Singlet Transducer]].<br />
<br />
For example, assuming separable gap lengths:<br />
<br />
<graphviz><br />
digraph G {<br />
<br />
SS [shape=doublecircle, color=red];<br />
EE [shape=doublecircle, color=red];<br />
<br />
SI [shape=house, label="SI / II", color=red];<br />
IM [shape=house, color=red];<br />
<br />
SS->SI [label="a'"];<br />
SS->IM [label="gb'"];<br />
SS->EE [label="h(b+c')"];<br />
<br />
SI->SI [label="v'"];<br />
SI->IM [label="gw'"];<br />
SI->EE [label="h(w+x')"];<br />
<br />
IM->SI [label="d'"];<br />
IM->IM [label="ge'"];<br />
IM->EE [label="h(e+f')"];<br />
<br />
label="Gotoh Pair HMM with input tape eliminated";<br />
}<br />
</graphviz><br />
<br />
Here<br />
* a' = a + gcp/(1-gr)<br />
* b' = b + gcq/(1-gr)<br />
* c' = c + gc(q+r)/(1-gr)<br />
* v' = v + gxp/(1-gr)<br />
* w' = w + gxq/(1-gr)<br />
* x' = x + gx(q+r)/(1-gr)<br />
* d' = d + gfp/(1-gr)<br />
* e' = e + gfq/(1-gr)<br />
* f' = f + gf(q+r)/(1-gr)<br />
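<br />
The primed weights come from summing over all detours through ID: enter ID, loop on ID->ID (weight gr) any number of times, then leave. A minimal Python sketch of that elimination follows, assuming h = 1 - g; the function name <code>eliminate_id</code> and the numbers are illustrative only.<br />

```python
def eliminate_id(g, first, second, third, p, q, r):
    """Fold detours through ID into one row, e.g. (a, b, c) -> (a', b', c')."""
    # Enter ID with weight g*third, then sum the geometric series over the
    # ID->ID self-loop (weight g*r): 1 + gr + (gr)^2 + ... = 1 / (1 - gr).
    detour = g * third / (1.0 - g * r)
    return (first + detour * p,          # leave ID towards II
            second + detour * q,         # leave ID towards IM
            third + detour * (q + r))    # leave ID towards EE

g = 0.9
h = 1.0 - g                     # assumption: h is the complement of g
p, q, r = 0.05, 0.65, 0.3       # ID row, arbitrary values summing to one
a, b, c = 0.1, 0.8, 0.1         # SS row, arbitrary values summing to one

a1, b1, c1 = eliminate_id(g, a, b, c, p, q, r)   # a', b', c'

# Outgoing weights of SS in the marginalized Y-emitter: a', gb', h(b+c').
ss_total = a1 + g * b1 + h * (b + c1)
```

The total <code>ss_total</code> stays at one (up to rounding), so the marginalized emitter is still properly normalized; the same function applies to the II row (v, w, x) and the IM row (d, e, f).<br />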
<br />
Assuming S/M symmetry, this Y-emitter is guaranteed to generate the same geometric sequence length distribution as the [[Singlet Transducer]] if the IM->EE and II->EE transitions both have probability h.<br />
<span style="color: #ff0000"><br />
<small><br />
(Are these the ONLY conditions under which it's a geometric distribution?)<br />
</small><br />
</span><br />
That is, if:<br />
* e + f' = 1 (IM->EE);<br />
* w + x' = 1 (II->EE).<br />
<br />
Substituting the expressions for f' and x' and using d + e + f = 1 and v + w + x = 1, these become:<br />
* d = gf(q+r)/(1-gr)
* v = gx(q+r)/(1-gr)
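<br />
To illustrate the initial-equilibrium conditions, the sketch below solves e + f' = 1 and w + x' = 1 for d and v (using the normalization constraints), assumes S/M symmetry so that SS has the same row as IM, and then computes the sequence length distribution of the marginalized Y-emitter. It assumes h = 1 - g; the function name <code>length_distribution</code> and the parameter values are arbitrary illustrative choices.<br />

```python
def length_distribution(g, f, x, q, r, n_max):
    """P(length = n) for the marginalized Y-emitter, n = 0..n_max."""
    h = 1.0 - g                       # assumption: h is the complement of g
    p = 1.0 - q - r                   # normalization of the ID row
    k = g * (q + r) / (1.0 - g * r)   # detour through ID that ends at EE
    d, v = f * k, x * k               # solves e + f' = 1 and w + x' = 1
    e, w = 1.0 - d - f, 1.0 - v - x   # normalization of the IM and II rows
    loop = g / (1.0 - g * r)          # enter ID and sum its self-loop

    def row(first, second, third):
        # (to II, to IM, to EE) after eliminating ID
        to_ii = first + loop * third * p
        to_im = g * (second + loop * third * q)
        to_ee = h * (second + third + loop * third * (q + r))
        return to_ii, to_im, to_ee

    ii, im = row(v, w, x), row(d, e, f)
    start = im                        # S/M symmetry: SS has the same row as IM
    probs = [start[2]]                # length 0: SS -> EE directly
    at_ii, at_im = start[0], start[1]
    for _ in range(n_max):
        probs.append(at_ii * ii[2] + at_im * im[2])
        at_ii, at_im = (at_ii * ii[0] + at_im * im[0],
                        at_ii * ii[1] + at_im * im[1])
    return probs

g = 0.9
probs = length_distribution(g, f=0.1, x=0.2, q=0.4, r=0.5, n_max=10)
geometric = [(1.0 - g) * g ** n for n in range(11)]
```

Under these conditions every state exits to EE with probability h, so <code>probs</code> matches the geometric distribution h * g^n term by term (up to rounding).<br />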
<br />
Consider the reversible separable-gap-length model, which has three free parameters (g, d, v).<br />
These two constraints would seem to leave only one free parameter, which is paradoxical.<br />
(I think you would expect three: e.g. the equilibrium sequence length distribution parameter (g), the gap opening probability (d) and the gap length distribution parameter (v).)<br />
So it appears that it isn't possible to have a transducer of this form that keeps insertions separate from deletions, is reversible '''and''' starts at equilibrium.<br />
<br />
Now consider perfect I/D exchangeability (which mingles I's with D's, but is still reversible).<br />
You start with four free parameters (g, d, v, p).<br />
The two constraints reduce the parameter set to two free parameters (which I think can be g & d).<br />
This seems almost plausible, but it's still a little weird that you don't have a third parameter for the gap length distribution.<br />
<br />
The weakest set of constraints is found when you have exchangeability between I and D states when the order of I's and D's is summed out.<br />
The five free parameters (g, d, v, p, x) are reduced to three (g, d, v) by the initial-equilibrium constraints.<br />
This seems to be the only scheme that allows reversibility, initial equilibrium & a free choice of gap open, gap extension and equilibrium length parameters.<br />
<br />
(An assumption in the above reasoning is that the ONLY way the marginalized Y-emitter generates a geometric distribution is when IM->EE and II->EE both have probability h...? I'm pretty sure this is true though... in that anything else will be a nontrivial sum of geometric distributions...)<br />
<br />
-- [[Ian Holmes]] - 25 Feb 2007</div>