Named after the Russian mathematician A. A. Markov (1856–1922), Markov chains are a special kind of “memoryless” stochastic process. We call Markov chains “memoryless” because at any given instant, the state of the system depends only on its state at the previous instant; what happened before that is of no consequence, and past conditions have no impact on where the system will proceed from its present state.

### Markov chain: simple random walk

The simplest example of a Markov chain is the simple random walk that I’ve written about in previous articles. (For those unfamiliar with random walks or stochastic processes, I recommend reading those articles before continuing with this one.)

In a simple random walk, where we end up next on the number line depends only on where we stand now. For example, if we’re currently standing at +3, we have a 50% chance that we will move up to +4, and a 50% chance we will move down to +2. The path that got us to our current state of +3 is of no consequence; the possible future paths depend only on our current state, not on how we got here.
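To make the memoryless step concrete, here's a minimal Python sketch of a simple random walk (the function name and details are my own illustration, not a standard routine):

```python
import random

def random_walk(steps, start=0):
    """Simulate a simple random walk: from any position, move
    +1 or -1 with equal probability. The next state depends only
    on the current position, not on the path that led there."""
    position = start
    path = [position]
    for _ in range(steps):
        position += random.choice([1, -1])  # the only "memory" is the current state
        path.append(position)
    return path

random.seed(42)
print(random_walk(10, start=3))  # one possible 10-step path starting at +3
```

Note that the function never inspects `path` when choosing the next step; that is the Markov property in miniature.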

To quote Athanasios Papoulis, author of the text *Probability, Random Variables and Stochastic Processes*: “Markov processes represent the simplest generalization of independent processes by permitting the outcome at any instant to depend only on the outcome that precedes it and none before that” (Papoulis 695). Now let’s consider a few examples.

### Markov chain: family surname survival

Markov chains have direct application to many real-world phenomena. For example, we could model the survival of a family surname over *n* generations with a Markov chain.

Let’s say that a certain family has the surname Smith. Each generation of Smiths will yield a number of descendants, some of whom will marry and take the surname of their spouse, while others will keep the name Smith and pass it on to their offspring. Each generation, and the number of Smiths it contains, can be considered a **finite state** in the Markov chain. The process is Markovian because the number of offspring who receive the surname Smith depends *only* on the surname of the parents. To put it another way, how many Smiths there were two or more generations ago will not determine the number of Smiths tomorrow; the future state of the system depends only on the current state, not the past states.
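As a rough sketch of the surname model, the Python below treats each generation's count of Smiths as the state of the chain. The offspring distribution (each name-carrier passes the surname to 0, 1, or 2 children, with probabilities chosen here purely for illustration) is my own assumption:

```python
import random

def surname_generation(current_carriers, offspring_probs):
    """One generation step: each name-carrier independently produces
    0, 1, or 2 children who keep the surname, with the given
    probabilities. The next count depends only on the current count."""
    next_count = 0
    for _ in range(current_carriers):
        next_count += random.choices([0, 1, 2], weights=offspring_probs)[0]
    return next_count

random.seed(1)
carriers = 1                   # start with one Smith
probs = [0.25, 0.5, 0.25]      # assumed offspring distribution
for generation in range(10):
    carriers = surname_generation(carriers, probs)
    if carriers == 0:          # no Smiths left: the name has died out
        break
print(carriers)
```

Notice the update uses only `carriers`, the current state, never the history of earlier generations.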

### Markov chains and transition probabilities

Now that we have an idea of what defines a Markov chain, let’s consider the concept of **transition probabilities**. Transition probabilities are the probabilities associated with moving from one state in the Markov chain to the next. For the simple random walk, the transition probabilities were 0.5 and 0.5; there was 0.5 chance you would move one integer up the number line, and a 0.5 chance you would move down one integer on the number line.

Now consider a particular genetic mutation that exists within a population of 100 organisms. Let’s say there is a *p* = 0.6 chance that this mutation gets passed on to the next generation, and a *q* = 0.4 chance that it will not. We could define our Markov chain to be the number of organisms *n* that carry the mutation *m* generations from now.

Recall that Markov chains are memoryless. This means that if we know the current state of the system, and we know the transition probabilities from the present state to possible future states, we have enough information to map *all the possible future states of the system, for every subsequent generation*. From this, we can answer such far-reaching questions as:

- How many organisms are likely to carry the mutation *m* generations from now?
- What is the probability the mutation survives for *m* generations?
- What is the probability the mutation eventually dies out entirely (i.e., *n* goes to zero)?
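Questions like these are often easiest to explore by simulation. The sketch below follows the mutation chain under a simplifying assumption of my own: each carrier leaves one descendant, who inherits the mutation with probability *p* = 0.6, so the next carrier count depends only on the current one:

```python
import random

def mutation_count(n, p, generations):
    """Track the number of mutation carriers. Each carrier independently
    passes the mutation to its one descendant with probability p (a
    simplifying assumption), so the next count is Binomial(n, p)."""
    for _ in range(generations):
        n = sum(1 for _ in range(n) if random.random() < p)
        if n == 0:
            break  # no carriers left; the count stays at zero forever
    return n

random.seed(0)
trials = 10_000
extinct = sum(mutation_count(100, 0.6, 20) == 0 for _ in range(trials))
print(extinct / trials)  # estimated chance the mutation is gone within 20 generations
```

Because each generation's count depends only on the previous count, running many trials from the current state is enough to estimate any of the questions above.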

The mathematics behind these questions can get complex and is beyond the scope of this article; however, for those interested, these lecture notes from Dartmouth do a great job of introducing more of the math behind these concepts. For now, let’s define another important feature of Markov chains: absorbing barriers.

### Absorbing barriers

For a Markov chain, an **absorbing barrier** is a possible future state that, once entered, the system can never leave. For example, let’s consider the bankroll of a gambler playing roulette. To keep things simple, let’s say this gambler places the same bet over and over with each spin of the wheel. His bankroll at some future time *t* is thus a Markov chain, fluctuating over time according to his wins and losses. We can imagine each **state** in this Markov process to be the value of the gambler’s bankroll, call it *x _{t}*, immediately following each spin of the roulette wheel.

Now, the **absorbing barrier** in this scenario is the gambler **going bust**. Once the gambler is out of money (*x _{t}* = 0), the game is over, and the system (the bankroll) will not move from zero. (We are assuming the gambler has the wisdom not to visit the ATM.)
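A quick simulation makes the barrier visible. In the Python sketch below, the 18/38 win probability for an even-money bet on an American roulette wheel is an added assumption for illustration; once the bankroll hits zero, the chain stays there:

```python
import random

def gamble(bankroll, bet, win_prob, max_spins):
    """Track the gambler's bankroll spin by spin. Bankroll zero is an
    absorbing barrier: once reached, the process never leaves it."""
    history = [bankroll]
    for _ in range(max_spins):
        if bankroll == 0:
            break  # absorbed: no money, no more bets
        bankroll += bet if random.random() < win_prob else -bet
        history.append(bankroll)
    return history

random.seed(7)
# An even-money bet on an American wheel wins with probability 18/38 (assumed).
path = gamble(bankroll=10, bet=1, win_prob=18/38, max_spins=1000)
print(path[-1])  # final bankroll: zero if the gambler was absorbed (went bust)
```

Running this repeatedly shows why the house edge matters: with a win probability below 0.5, the walk drifts toward the absorbing barrier at zero.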

Earlier, when considering a simple random walk on the integer number line, we had a system with no absorbing barriers, and thus the random walk could proceed indefinitely. In real life, however, many Markovian processes do indeed involve absorbing barriers. Consider a game of tennis.

### Markov chain: a tennis match

(Note: For the following discussion, I am going to assume everyone understands the basic rules and scoring of tennis. If you don’t, here’s a link to the rules!)

In a game of tennis, let’s assume the server has a probability *p* of winning the point, and the receiver has a probability *q* = 1-p of winning the point. With some basic probability calculations, we could map out the probabilities of each point. These are given in the following table.

| Point | Possible scores and probabilities |
|---|---|
| First point | P(15:0) = p; P(0:15) = q |
| Second point | P(30:0) = p^{2}; P(15:15) = 2pq; P(0:30) = q^{2} |
| Third point | P(40:0) = p^{3}; P(30:15) = 3p^{2}q; P(15:30) = 3pq^{2}; P(0:40) = q^{3} |
| Fourth point | P(server wins) = p^{4}; P(40:15) = 4p^{3}q; P(deuce) = 6p^{2}q^{2}; P(15:40) = 4pq^{3}; P(receiver wins) = q^{4} |
| Fifth point | P(server wins) = p^{4}(1 + 4q); P(advantage in) = 4p^{3}q^{2}; P(deuce) = 6p^{2}q^{2}; P(advantage out) = 4p^{2}q^{3}; P(receiver wins) = q^{4}(1 + 4p) |
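The fourth-point row can be checked by brute force: enumerate all 2^{4} sequences of four points and add up the probability of each resulting score. The value *p* = 3/5 below is an arbitrary choice for illustration:

```python
from itertools import product
from fractions import Fraction

p = Fraction(3, 5)  # assumed probability the server wins a point
q = 1 - p

# Enumerate every sequence of four points ('S' = server wins the point).
outcomes = {}
for seq in product('SR', repeat=4):
    s = seq.count('S')
    prob = p**s * q**(4 - s)
    key = (s, 4 - s)  # (server points won, receiver points won)
    outcomes[key] = outcomes.get(key, 0) + prob

print(outcomes[(2, 2)])    # P(deuce): all sequences with two points each
print(6 * p**2 * q**2)     # the table's formula, for comparison
```

Using exact `Fraction` arithmetic, the enumerated total for two points apiece matches the table's 6p^{2}q^{2} exactly, since there are C(4, 2) = 6 such sequences.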

If the game goes to deuce, the rule that a player must win by two points turns the game into a Markov chain with five possible states. Let’s refer to them as *e*_{1} = server wins, *e*_{2} = ad in, *e*_{3} = deuce, *e*_{4} = ad out, and *e*_{5} = receiver wins. Of these states, two are **absorption states** (*e*_{1}, *e*_{5}), and the remaining three are **transient states** (meaning the system can move in and out of them). We can map these states with a **transition probability matrix**, where the entry in row *i*, column *j* is the probability of moving from *e*_{i} to *e*_{j}:

|  | *e*_{1} | *e*_{2} | *e*_{3} | *e*_{4} | *e*_{5} |
|---|---|---|---|---|---|
| *e*_{1} | 1 | 0 | 0 | 0 | 0 |
| *e*_{2} | *p* | 0 | *q* | 0 | 0 |
| *e*_{3} | 0 | *p* | 0 | *q* | 0 |
| *e*_{4} | 0 | 0 | *p* | 0 | *q* |
| *e*_{5} | 0 | 0 | 0 | 0 | 1 |

We can make sense of the above as follows. Each row in the matrix corresponds to a state of the Markov chain (*e*_{1}, *e*_{2}, *e*_{3}, *e*_{4}, *e*_{5}, as defined above), and the probabilities listed are the transition probabilities of moving from one state to another.

Here’s how we could read the second row, moving sequentially from left to right. Since this is the second row, the probabilities correspond to moving from *e*_{2} to *e*_{i}, some other state of the system.

Since it’s currently ad in, the probability of the server winning the game is *p* (second row, first cell). This is *p*_{2,1}, the probability of moving from *e*_{2} to *e*_{1}. The probability that the point is played and the system stays at ad in is 0 (*p*_{2,2} = 0). The probability the system moves back to deuce is *p*_{2,3} = *q*. The probability of the system moving immediately to ad out is *p*_{2,4} = 0, and the probability the receiver wins on this point is *p*_{2,5} = 0. In this way, the transition matrix maps out all possible states of the system. The cells with probability 1 correspond to the absorption states; the probability of moving from *e*_{1} (server wins) to any other state is 0. In other words, once the game is over, it’s over.
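Iterating this matrix lets us compute the server's chance of winning from deuce. The Python sketch below repeatedly applies the matrix to a distribution that starts at deuce; the standard closed-form answer p^{2}/(p^{2} + q^{2}) is printed for comparison, and the value *p* = 0.6 is an arbitrary illustration:

```python
# States: e1 = server wins, e2 = ad in, e3 = deuce, e4 = ad out, e5 = receiver wins.
p, q = 0.6, 0.4  # assumed point-win probabilities

# Transition probability matrix; rows are the current state.
P = [
    [1, 0, 0, 0, 0],  # e1: absorbing -- server has won
    [p, 0, q, 0, 0],  # e2: ad in -> server wins (p) or back to deuce (q)
    [0, p, 0, q, 0],  # e3: deuce -> ad in (p) or ad out (q)
    [0, 0, p, 0, q],  # e4: ad out -> back to deuce (p) or receiver wins (q)
    [0, 0, 0, 0, 1],  # e5: absorbing -- receiver has won
]

def step(dist, P):
    """Apply the transition matrix once to a distribution over the states."""
    return [sum(dist[i] * P[i][j] for i in range(5)) for j in range(5)]

dist = [0, 0, 1, 0, 0]        # start at deuce
for _ in range(200):          # iterate until the chain is numerically absorbed
    dist = step(dist, P)

print(dist[0])                 # probability the server eventually wins from deuce
print(p**2 / (p**2 + q**2))    # closed form, for comparison
```

The closed form follows from the chain itself: from deuce, two points either end at an absorbing state (p^{2} or q^{2}) or return to deuce (2pq), giving p^{2}/(1 − 2pq) = p^{2}/(p^{2} + q^{2}).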

Need more help understanding Markov chains? Check out our statistics blog and videos!
