# Tips For Codon Matrices

From Biowiki

# Codon matrices in xrate

(See also known issues with DART)

When using XRATE to estimate large rate matrices (e.g. 61*61 codon matrices), the following tips are recommended:

- Decrease the `--mininc` parameter. This command-line parameter determines the minimum fractional increase in log-likelihood that is required for EM to continue. For models with lots of parameters, small changes in the log-likelihood can represent substantial changes in the lesser-used parameters. You may therefore want to set `--mininc` to a lower value than its default. For example: `xrate --mininc .00001 [...other arguments...]`

- Increase the `--forgive` parameter. This command-line parameter sets the number of "bad" iterations of EM that will be forgiven. These are iterations where the likelihood decreases instead of increasing. In theory this should never happen, since the likelihood should keep increasing asymptotically, but due to precision error it can occasionally happen in practice, especially with these big rate matrices. I usually set `--forgive 2` for codon matrices, which means that at worst xrate will do 2 unnecessary iterations of EM.

- Start from a uniform (flat) seed. In other words, all the initial rates and probabilities for the model should be the same. Additionally, they should be under-estimates (i.e. start with rates a bit lower than the eventual values you anticipate). Both of these help convergence and reduce the chances of the EM algorithm getting stuck in a local maximum.
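
To make the tips above concrete, here is a rough Python sketch of how a flat seed and the two stopping parameters might fit together. This is not xrate's code: the function names `flat_seed` and `em_iterations_run`, the seed rate of 0.01, and the exact bookkeeping are all assumptions made for illustration. In particular, the sketch assumes an iteration counts as "bad" whenever its fractional gain over the best log-likelihood so far falls below the threshold, and that a good iteration resets the count.

```python
from itertools import product

# Stop codons of the standard genetic code; the other 61 codons are "sense" codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def flat_seed(rate=0.01):
    """Return a uniform seed: the same (deliberately low) rate for every
    ordered pair of distinct sense codons. The value 0.01 is hypothetical."""
    codons = ["".join(c) for c in product("ACGT", repeat=3)]
    sense = [c for c in codons if c not in STOP_CODONS]
    return {(a, b): rate for a in sense for b in sense if a != b}


def em_iterations_run(log_likelihoods, min_inc=1e-5, forgive=2):
    """Count how many EM iterations run before an assumed stopping rule fires.

    log_likelihoods[k] is the log-likelihood after iteration k + 1.
    Values are assumed to be nonzero (real log-likelihoods are negative).
    """
    best = log_likelihoods[0]
    bad = 0  # consecutive iterations that failed the threshold
    for iteration, ll in enumerate(log_likelihoods[1:], start=2):
        frac_inc = (ll - best) / abs(best)
        if frac_inc >= min_inc:
            best = ll
            bad = 0
        else:
            bad += 1
            if bad > forgive:
                return iteration  # stop: too many bad iterations
    return len(log_likelihoods)


seed = flat_seed()
print(len(seed))  # 61 * 60 = 3660 off-diagonal rates

trace = [-1000.0, -900.0, -899.999, -899.9985, -899.998, -899.9975]
print(em_iterations_run(trace, min_inc=1e-5, forgive=2))
print(em_iterations_run(trace, min_inc=1e-7, forgive=2))
```

With the made-up trace above, the late gains are each a few parts per million of the log-likelihood, so the looser threshold stops early while the tighter one lets EM keep going. That is the situation the `--mininc` tip is aimed at.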

-- Ian Holmes - 30 Apr 2010