Wednesday, February 20, 2013

Fundamental questions of Hidden Markov Models

There are three problems, canonically, that are solved with the use of Hidden Markov Models:


  • Given an HMM, and a sequence of outputs, evaluate the probability of the HMM producing the given output.
  • In the same conditions, determine the most likely sequence of states that would have produced the given output.
  • Given just a sequence of outputs and some states, create an HMM that would best account for the given output.

For the first, the probability is calculated by multiplying the probability of each individual output. A simple example is a single-state HMM that has the following outputs: 'A', with probability .5, 'B' with probability .3, and 'C' with probability .2. If the output sequence is ABCABC, multiplying the probabilities gives .0009. Clearly, as the length of a sequence increases, the probability of it being output drops; however, we are generally more concerned with comparative probabilities. For example, the sequence AAAAAA is much more likely to have been generated by our model, with a probability of 0.015625.

For the second, you might have a two-state HMM. In state 1, it outputs 'A' with a probability of .5 and 'B' with a probability of .5.  In state 2, it outputs 'A' with a probability of .8, 'B' with a probability of .1, and 'C' with a probability of .1. Given the sequence CCC, we know that the model is in state 2 the entire time,as state 1 has no possibility of outputting a C. But if the sequence is CAA, we don't know the state that the model was in when it generated the second or third outputs. But if we also have a model parameter that says the probability of transitioning from state 2 to state 1 is .01, we can see that it is still overwhelmingly likely that the model never exited state 2. Given an output sequence of a few hundred characters, however, the likelihood is very high for an occasional jump into state 1.

The third is more complicated, as it involves calculating the most likely parameters for the model based only on the output sequence. There are algorithms for calculating them, however.

No comments:

Post a Comment