
Markov Chain — Definition, Formula & Examples

A Markov chain is a sequence of random states in which the probability of the next state depends only on the current state, not on how the process arrived there. This "memoryless" property is called the Markov property.

A Markov chain is a stochastic process \{X_n\}_{n \geq 0} taking values in a countable state space S such that for all n \geq 0 and all states i_0, i_1, \ldots, i_n, j \in S, the process satisfies P(X_{n+1} = j \mid X_n = i_n, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = P(X_{n+1} = j \mid X_n = i_n). When these transition probabilities do not depend on n, the chain is called time-homogeneous.

Key Formula

\pi^{(n)} = \pi^{(0)} \, P^n
Where:
  • \pi^{(n)} = Row vector of state probabilities after n steps
  • \pi^{(0)} = Row vector of initial state probabilities
  • P = Transition matrix whose entry p_{ij} is the probability of moving from state i to state j
  • n = Number of time steps
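The key formula can be checked numerically. Below is a minimal sketch with numpy; the 2-state transition matrix and starting vector are illustrative values chosen for this example, not taken from the article:

```python
import numpy as np

# Illustrative 2-state transition matrix; each row sums to 1.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi0 = np.array([1.0, 0.0])  # start in state 0 with certainty

# pi^(n) = pi^(0) P^n
n = 3
pi_n = pi0 @ np.linalg.matrix_power(P, n)
print(pi_n)        # distribution over states after 3 steps
print(pi_n.sum())  # probabilities still sum to 1
```

Because each row of P sums to 1, every power of P is again a valid transition matrix, so \pi^{(n)} is always a probability vector.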

How It Works

You model a system by identifying its possible states and assigning a transition probability p_{ij} for moving from state i to state j. These probabilities are organized into a transition matrix P, where each row sums to 1. To find the probability distribution after n steps, you multiply the initial state vector \pi^{(0)} by P^n. A key question is whether the chain converges to a stationary distribution \pi satisfying \pi P = \pi, meaning the long-run proportions of time spent in each state stabilize regardless of the starting state.
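The stationary distribution can be computed directly by solving \pi P = \pi together with the constraint that \pi sums to 1. A sketch with numpy, using an illustrative 2-state matrix:

```python
import numpy as np

# Illustrative transition matrix (each row sums to 1).
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])

# Solve pi P = pi subject to sum(pi) = 1.
# pi (P - I) = 0 gives two equations; stack the normalization
# constraint on top and solve the overdetermined system.
A = np.vstack([(P - np.eye(2)).T, np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print(pi)       # stationary distribution
print(pi @ P)   # equals pi, confirming pi P = pi
```

For this matrix the stationary distribution is (2/3, 1/3): applying P once more leaves it unchanged.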

Worked Example

Problem: A weather model has two states: Sunny (S) and Rainy (R). If today is sunny, tomorrow is sunny with probability 0.8 and rainy with probability 0.2. If today is rainy, tomorrow is sunny with probability 0.4 and rainy with probability 0.6. If today is sunny, what is the probability distribution for the weather two days from now?
Set up: Write the transition matrix P and the initial state vector.
P = \begin{pmatrix} 0.8 & 0.2 \\ 0.4 & 0.6 \end{pmatrix}, \quad \pi^{(0)} = \begin{pmatrix} 1 & 0 \end{pmatrix}
Compute P²: Multiply P by itself to get the two-step transition matrix.
P^2 = \begin{pmatrix} 0.8(0.8)+0.2(0.4) & 0.8(0.2)+0.2(0.6) \\ 0.4(0.8)+0.6(0.4) & 0.4(0.2)+0.6(0.6) \end{pmatrix} = \begin{pmatrix} 0.72 & 0.28 \\ 0.56 & 0.44 \end{pmatrix}
Find the distribution: Multiply the initial vector by P².
\pi^{(2)} = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} 0.72 & 0.28 \\ 0.56 & 0.44 \end{pmatrix} = \begin{pmatrix} 0.72 & 0.28 \end{pmatrix}
Answer: Two days from now, there is a 72% chance of sunny weather and a 28% chance of rain.
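The worked example above can be verified in a few lines of numpy:

```python
import numpy as np

# Transition matrix from the weather example; rows/columns are (Sunny, Rainy).
P = np.array([[0.8, 0.2],
              [0.4, 0.6]])
pi0 = np.array([1.0, 0.0])   # today is sunny

# Two-step transition matrix and the resulting distribution.
P2 = np.linalg.matrix_power(P, 2)
pi2 = pi0 @ P2
print(P2)    # approximately [[0.72, 0.28], [0.56, 0.44]]
print(pi2)   # approximately [0.72, 0.28]
```

Because today's state is Sunny with certainty, \pi^{(2)} is simply the first row of P^2.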

Why It Matters

Markov chains are foundational in courses on stochastic processes and are used extensively in queueing theory, genetics, finance, and natural language processing. Google's original PageRank algorithm modeled web surfing as a Markov chain to rank search results.
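The PageRank idea can be sketched as a Markov chain in a few lines. The tiny three-page link graph below is a made-up example, and the random-surfer model (follow a link with probability d, otherwise jump to a uniformly random page, with the standard damping factor d = 0.85) is a simplification of the full algorithm:

```python
import numpy as np

# Hypothetical 3-page link graph: page 0 links to 1 and 2,
# page 1 links to 2, page 2 links to 0.
links = {0: [1, 2], 1: [2], 2: [0]}
n, d = 3, 0.85   # d is the standard damping factor

# Random-surfer transition matrix: with probability d follow a
# random outgoing link, otherwise jump to a uniformly random page.
P = np.full((n, n), (1 - d) / n)
for page, outs in links.items():
    for out in outs:
        P[page, out] += d / len(outs)

# Power iteration: repeatedly apply pi P until it stabilizes.
pi = np.full(n, 1 / n)
for _ in range(100):
    pi = pi @ P
print(pi)   # stationary probabilities = page ranks
```

The stationary distribution of this chain ranks the pages: page 2, which receives links from both other pages, ends up with the highest probability.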

Common Mistakes

Mistake: Assuming the Markov property means each state is equally likely or that transitions are independent of the current state.
Correction: The Markov property only means the future is independent of the past given the present. Transition probabilities still depend on which state you are currently in.