Lexicalized transducers are String Transducers whose state is explicitly augmented by some low-dimensional context variables.
- In the context-dependent transducer GSIMULATOR by Avinash Varadarajan, emission and transition probabilities depend on the N most recently absorbed and/or emitted symbols. This is a special case of lexicalization in which the context variable is unambiguously determined by the input and output sequences.
- Adding a latent state variable to the alphabet, as in Holmes & Rubin, "An expectation maximization algorithm for training hidden substitution models", J. Mol. Biol. 2002;317:753-64, can also be viewed as lexicalization. Such a model can in principle remain context-insensitive, so it does not necessarily break the column-independence assumption of Felsenstein Wildcards. In this case, however, the latent state may not be uniquely determined by the sequence.
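
The first case above, where the context is fully determined by what has been absorbed and emitted so far, can be sketched in a few lines of Python. This is only an illustration and is not taken from GSIMULATOR: the function names `contexts` and `path_log_prob`, the `(kind, symbol)` event encoding, and the `prob` callback are all assumptions made for this example.

```python
import math
from collections import deque

def contexts(events, n):
    """Return the context seen before each event.

    `events` is a list of (kind, symbol) pairs, with kind "absorb" or "emit"
    (a hypothetical encoding). The context is the tuple of the (at most) n
    most recent events, so it is a deterministic function of the path.
    """
    window = deque(maxlen=n)      # sliding window over the last n events
    seen = []
    for event in events:
        seen.append(tuple(window))  # context in force when `event` occurs
        window.append(event)        # oldest event drops out once full
    return seen

def path_log_prob(events, n, prob):
    """Sum log-probabilities of events, each conditioned on its context.

    `prob(context, event)` is a user-supplied function (an assumption here)
    returning a probability in (0, 1].
    """
    return sum(math.log(prob(ctx, ev))
               for ctx, ev in zip(contexts(events, n), events))

path = [("absorb", "A"), ("emit", "C"), ("absorb", "G")]
print(contexts(path, 2))
```

Because the context is recomputed from the path itself, no summation over hidden context values is needed. In the latent-state case of the second bullet, by contrast, the likelihood would have to sum over the possible latent states.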
-- Ian Holmes - 23 Apr 2008