While several reports have shown the general effectiveness of HMMs [Krogh et al., 1994a, Brown et al., 1993, Baldi et al., 1994, Eddy et al., 1995, Eddy, 1995], this section takes a close look at the effectiveness of each extension to the basic method.
We chose the globin family for the first of these illustrative
experiments because of our previous familiarity with it. From
a set of 624 globins, close homologues were removed by applying a
maximum entropy weighting scheme [Krogh & Mitchison, 1995] and
discarding all
sequences with a very small weight (
), which left us with
167 globins. For the experiments, our group of 167 sequences was
randomly divided into a training set of 50 sequences and a test set of
117 sequences, except in the experiments on training set size.
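The filtering and splitting procedure described above can be sketched as follows. This is a minimal illustration rather than the code used for the experiments: it assumes the maximum entropy weights have already been computed elsewhere, and the function name `filter_and_split`, the `min_weight` parameter, and the seed are hypothetical. The weight cutoff is passed in by the caller because its value is not given in the text above.

```python
import random


def filter_and_split(weights, min_weight, n_train, seed=0):
    """Drop low-weight sequences, then split the rest at random.

    weights    : dict mapping a sequence identifier to its weight
                 (assumed to come from a separate weighting step)
    min_weight : hypothetical cutoff; sequences below it are discarded
    n_train    : number of sequences to place in the training set
    seed       : seed for the random split, for reproducibility
    """
    # Keep only sequences whose weight reaches the cutoff. Sorting makes
    # the result independent of dictionary insertion order.
    kept = sorted(sid for sid, w in weights.items() if w >= min_weight)
    if n_train > len(kept):
        raise ValueError("n_train exceeds the number of retained sequences")

    # Shuffle with a private generator so the global random state is untouched.
    rng = random.Random(seed)
    rng.shuffle(kept)

    # The first n_train identifiers form the training set, the rest the test set.
    return kept[:n_train], kept[n_train:]
```

With 167 retained sequences and `n_train=50`, the call returns a training set of 50 identifiers and a test set of 117, matching the split described above.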
The statistical quality of an HMM is measured by the probability
that the final model assigns to the test set. SAM reports this as a
negative-log-likelihood
($-\log P(\mathrm{sequences} \mid \mathrm{model})$), or NLL, score.
This section considers the effects of each
of the more important extensions on NLL scores. Ideally, we would
like small NLL scores whose distribution over multiple runs with
different random seeds is sharply peaked. Such a peaked distribution
implies that far fewer than the thousands of runs performed in these
experiments are required to generate a good model.
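To make the evaluation criterion concrete, the sketch below computes an NLL score from per-sequence log-probabilities and summarizes how the score varies across random seeds. It is an illustration under stated assumptions, not SAM's implementation: the function names are hypothetical, natural logarithms are assumed, and averaging over the test sequences is an assumption about normalization that the text above does not specify.

```python
import statistics


def average_nll(log_probs):
    """Mean negative log-likelihood over a set of test sequences.

    log_probs : iterable of log P(sequence | model), one value per sequence
                (natural log assumed; averaging per sequence is an assumption)
    """
    log_probs = list(log_probs)
    if not log_probs:
        raise ValueError("at least one sequence is required")
    return -sum(log_probs) / len(log_probs)


def summarize_runs(nll_by_seed):
    """Summarize NLL scores obtained from runs with different random seeds.

    nll_by_seed : dict mapping a random seed to the NLL score of that run
    A small standard deviation corresponds to a sharply peaked distribution.
    """
    scores = list(nll_by_seed.values())
    if not scores:
        raise ValueError("at least one run is required")
    return {
        "best": min(scores),  # lower NLL means a better model
        "mean": statistics.mean(scores),
        # The sample standard deviation needs two or more runs.
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }
```

A low `best` score together with a small `stdev` corresponds to the desired case: most seeds yield a model close to the best one, so only a few runs are needed.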