Recurrence in the countable-space case

Recurrence is a fundamental concept related to the existence of invariant measures. We explore such concept here.

Setting

Here, we assume that $(X_n)_n$ is a time-homogeneous, discrete-time Markov chain with a countable state space. More precisely, we assume the indices are $n=0, 1, 2, \ldots,$ and that the space $\mathcal{X}$ is finite or countably infinite. The sample space is the probability space $(\Omega, \mathcal{F}, \mathbb{P}),$ where $\mathcal{F}$ is the $\sigma$-algebra on the set $\Omega$ and $\mathbb{P}$ is the probability distribution. The one-step transition distribution is denoted by $K(x, y) = \mathbb{P}(X_{n+1} = y | X_n = x),$ and is independent of $n=0, 1, \ldots,$ thanks to the time-homogeneous assumption. Similary, the $n$-step transition distribution is denoted $K_n(x, y) = \mathbb{P}(X_{k+n} = y | X_k = x),$ for $n=1, 2, \ldots,$ independently of $k=0, 1, \ldots.$

Definitions

We start with some fundamental definitions.

Return time

Definition (return time)

Given a point $x\in\mathcal{X},$ we define the return time to $x$ by

\[ \tau_x = \inf\left\{n\in\mathbb{N}\cup\{+\infty\}; \; n = \infty \textrm{ or } X_n = x\right\}.\]

We call it return time, but, in the definition itself, we do not condition it on $X_0 = x,$ so it is not always a "return" time. Per se, the quantity

\[ \mathbb{P}(\tau_x < \infty)\]

is just the probability of reaching $x$ in a finite number of steps. When conditioning it to $X_0 = x,$ then,

\[ \mathbb{P}(\tau_x < \infty | X_0 = x)\]

is indeed the probability of returning to $x$ in a finite number of steps.

Meanwhile, for $y \neq x,$

\[ \mathbb{P}(\tau_y < \infty | X_0 = x)\]

is the probability of reaching $y$ from $x$ in a finite number of steps.

Number of passages

Another useful quantity is the random variable denoting the number of passages through a given state $x\in\mathcal{X},$

Definition (number of passages)

The number of passages through a given state $x\in\mathcal{X}$ is defined by

\[ \eta_x = \sum_{n=1}^\infty \mathbb{1}_{\{X_n = x\}}.\]

Notice we did not include the starting time $n=0$ in the definition. Some authors do, while others don't (e.g. Robert and Casella (2004) don't, while Lawler(2006) does).

Relations between return time and number of visits

There are some important relations between return time and number of visits. Indeed, the first return time is finite if, and only if, the state is visited at least once. This is valid for each sample point. We can express this as

\[ \tau_x(\omega) < \infty \quad \Longleftrightarrow \quad \eta_x(\omega) \geq 1,\]

for every $\omega\in\Omega.$ As a consequence,

\[ \mathbb{P}(\tau_x < \infty | X_0 = x) = \mathbb{P}(\eta_x \geq 1 | X_0 = x).\]

The complement of that is

\[ \mathbb{P}(\tau_x = \infty | X_0 = x) = \mathbb{P}(\eta_x = 0 | X_0 = x).\]

More generally, the chances of having multiple visits is a power of the return time. This follows from the Markovian property, since once back to $x$ for the $m-1$ time, the chances of coming back again is the same as coming back for the first time.

Proposition (return time and number of visits)

Let $x\in\mathcal{X}$ and set

\[ q = \mathbb{P}(\tau_x < \infty | X_0 = x).\]

Then,

\[ \mathbb{P}(\eta_x \geq m | X_0 = x) = q^m,\]

and

\[ \mathbb{P}(\eta_x = m | X_0 = x) = q^m(1 - q),\]

for any $m\in\mathbb{N}.$

Proof

We have

\[ \begin{align*} \mathbb{P}(\eta_x \geq m | X_0 = x) & = \mathbb{P}\bigg( \exist n_1, \ldots, n_m\in \mathbb{N}, X_i = x, n_{m-1} < i \leq n_m \Leftrightarrow i = n_m \\ & \hspace{1in} \bigg| X_j = x, 0 \leq j \leq n_{m-1} \Leftrightarrow j = 0, n_1, \ldots, n_{m-1}\bigg) \\ & \quad \times \mathbb{P}\bigg( \exist n_1, \ldots, n_{m-1} \in \mathbb{N}, X_i = x, n_{m-2} < i \leq n_{m-1} \Leftrightarrow i = n_{m-1} \\ & \hspace{1in} \bigg| X_j = x, 0 \leq j \leq n_{m-2} \Leftrightarrow j = 0, n_1, \ldots, n_{m-2}\bigg) \\ & \quad \times \cdots \\ & \quad \times \mathbb{P}\bigg( \exist n_1 \in \mathbb{N}, X_i = x, 0 < i \leq n_1 \Leftrightarrow i = n_1 \bigg| X_0 = x \bigg) \end{align*}\]

By the Markov property of the chain, only the most recent conditioned state is important, so that

\[ \begin{align*} \mathbb{P}(\eta_x \geq m | X_0 = x) & = \mathbb{P}\left( \exist n_1, \ldots, n_m\in \mathbb{N}, X_i = x, 1\leq i \leq n_m \Leftrightarrow i = n_1, \ldots,n_m | X_0 = x\right) \\ & = \mathbb{P}\bigg( \exist n_{m-1}, n_m\in \mathbb{N}, X_i = x, n_{m-1} < i \leq n_m \Leftrightarrow i = n_m \bigg| X_{n_{m-1}} = x\bigg) \\ & \;\; \times \mathbb{P}\bigg( \exist n_{m-2}, n_{m-1} \in \mathbb{N}, X_i = x, n_{m-2} < i \leq n_{m-1} \Leftrightarrow i = n_{m-1} \bigg| X_{n_{m-2}} = x \bigg) \\ & \;\; \times \cdots \\ & \;\; \times \mathbb{P}\bigg( \exist n_1 \in \mathbb{N}, X_i = x, 0 < i \leq n_1 \Leftrightarrow i = n_1 \bigg| X_0 = x \bigg) \end{align*}\]

By the time-homogeneous property, we can shift each probability by $-n_{k-1}$ to see that only the time differences $d_k = n_{k} - n_{k-1}$ matters, for $k=1, \ldots, m,$ with all the events conditioned at the initial time $n_0 = 0,$, i.e.

\[ \begin{align*} \mathbb{P}(\eta_x \geq m | X_0 = x) & = \mathbb{P}\bigg( \exist d_m \in \mathbb{N}, X_i = x, 0 < i \leq d_m \Leftrightarrow i = d_m \bigg| X_0 = x\bigg) \\ & \;\; \times \mathbb{P}\bigg( \exist d_{m-1} \in \mathbb{N}, X_i = x, 0 < i \leq d_{m-1} \Leftrightarrow i = d_{m-1} \bigg| X_0 = x \bigg) \\ & \;\; \times \cdots \\ & \;\; \times \mathbb{P}\bigg( \exist d_1 \in \mathbb{N}, X_i = x, 0 < i \leq d_1 \Leftrightarrow i = d_1 \bigg| X_0 = x \bigg) \end{align*}\]

The difference is just a matter of notation, for which we can denote them all by $d,$ and write

\[ \mathbb{P}(\eta_x \geq m | X_0 = x) = \mathbb{P}\bigg( \exist d \in \mathbb{N}, X_i = x, 0 < i \leq d \Leftrightarrow i = d \bigg| X_0 = x\bigg)^m.\]

The existence of one such $d$ is equivalent to $\eta_x \geq 1,$ which is equivalent to $\tau_x < \infty,$ se we can write

\[ \mathbb{P}(\eta_x \geq m | X_0 = x) = \mathbb{P}(\tau_x < \infty | X_0 = x)^m = q^m,\]

for all $m\in\mathbb{N},$ which proves the first statement.

Now, the events $\tau_x = m$ and $\tau_x \geq m+1$ are independent, so that

\[ \begin{align*} \mathbb{P}(\eta_x = m | X_0 = x) & = \mathbb{P}(\eta_x \geq m | X_0 = x) - \mathbb{P}(\eta_x \geq m + 1 | X_0 = x) \\ & = q^m - q^{m+1} \\ & = q^m (1 - q), \end{align*}\]

which completes the proof.

Recurrence and Transience

As the name says it, recurrence refers to a property that occurs repeatedly in time. In the case of a state $x\in\mathcal{X},$ we are interested in knowning how often the state is observed, i.e. how often $X_n = x,$ with respect to $n\in\mathbb{N}.$

The idea is that a state that is not visited after some instant in time is, in some sense, transient, while others that get visited infinitely often in time are recurrent. But since this is a stochastic process, we must quantify how often this happens in a probabilistic way, i.e. with respect to the underlying probability measure. This can be measured by the probability of a state to be visited infinitely often in the chain, i.e.

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x).\]

If the probability is one, we will almost surely observe this state infinitely many times. If the probability is zero, we will almost surely observe this state at most a finite number of times. What if the probability is in between zero and one? Could it be in a superpositioned state, i.e. in some instances it is visited infinitely often while in other instances it is visited only finitely-many times? Fortunately, this cannot happen. The probability is either zero or one, and we can definitely characterize it as recurrent or transient.

Proposition"

Consider a state $x\in\mathcal{X}.$ Then either

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 1\]

or

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 0\]

Proof

The idea is to use the Markovian property, that if the probability of being visited once after the initial time is $q,$ then, the probability of having $m$ visits is $q^m,$ to deduce that, at the limit $m\rightarrow \infty,$ it is either zero or one, depending on whether $0 \leq q < 1$ or $q = 1.$

More precisely, we know that

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = \mathbb{P}(\eta_x = \infty | X_0 = x).\]

This means

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = \mathbb{P}(\eta_x \geq m, \;\forall m\in\mathbb{N} | X_0 = x).\]

But

\[ \{\eta_x \geq m, \;\forall m\in\mathbb{N}, X_0 = x \} = \bigcap_{m\in\mathbb{N}}\{\eta_x \geq m, X_0 = x\},\]

with the intersection being of non-increasing sets, with respect to increasing $m\in\mathbb{N}.$ Thus, by the continuity of probability measures,

\[ \mathbb{P}(\eta_x \geq m, \;\forall m\in\mathbb{N}, X_0 = x) = \lim_{m\rightarrow} \mathbb{P}(\eta_x \geq m, X_0 = x).\]

Similarly,

\[ \mathbb{P}(\eta_x \geq m, \;\forall m\in\mathbb{N} | X_0 = x) = \lim_{m\rightarrow} \mathbb{P}(\eta_x \geq m | X_0 = x).\]

We have already seen that

\[ \mathbb{P}(\eta_x \geq m | X_0 = x) = q^m,\]

where

\[ q = \mathbb{P}(\tau_x < \infty | X_0 = x).\]

Thus,

\[ \mathbb{P}(\eta_x \geq m, \;\forall m\in\mathbb{N} | X_0 = x) = \lim_{m\rightarrow} q^m.\]

Clearly,

\[ \lim_{m\rightarrow} q^m = 0, \quad \textrm{if } 0 < q \leq 1,\]

and

\[ \lim_{m\rightarrow} q^m = 1, \quad \textrm{if } q = 1.\]

Since $q$ is a probability, which can only assume values in the range $0\leq q \leq 1,$ the only possible limits are 0 and 1, proving the result.

With this in mind, we have the following definitions.

Definition (recurrent and transient states)

A state $x$ is called recurrent when

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 1,\]

and is called transient when

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 0,\]

with no intermediate values being possible.

Equivalent definitions of recurrence can be made with the notions of number of passages and return time. In that regard, we have the following result, which we borrow, essentially, from Lawler(2006), except that we do not include the time $n=0$ in the number of passages, so the formula is slightly different.

Theorem

For any state $x\in\mathcal{X},$ we have

\[ x \textrm{ is recurrent } \quad \Longleftrightarrow \quad \mathbb{P}(\tau_x < \infty | X_0 = x) = 1 \quad \Longleftrightarrow \quad \mathbb{E}[\eta_x | X_0 = x] = \infty,\]

and

\[ x \textrm{ is transient } \quad \Longleftrightarrow \quad \mathbb{P}(\tau_x < \infty | X_0 = x) < 1 \quad \Longleftrightarrow \quad \mathbb{E}[\eta_x | X_0 = x] < \infty.\]

Moreover, we have the relation

\[ \mathbb{E}[\eta_x | X_0 = x] = \frac{\mathbb{P}(\tau_x < \infty | X_0 = x)}{1 - \mathbb{P}(\tau_x < \infty | X_0 = x)},\]

with the understanding that the left hand side is infinite when the probability in the right hand side is 1.

Proof

We have that $\tau_x = \infty$ iff $X_n$ never returns to $x.$ If $\tau_x < \infty,$ then it returns to $x$ in finite time at least once. If $\tau_x < \infty$ with probability one, then, with probability one, it will return again and again to $x,$ still with probability one, since the countable intersection of sets of full measure still has full measure. Thus,

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 1 \quad \Longleftrightarrow \quad \mathbb{P}(\tau_x < \infty | X_0 = x) = 1.\]

This proves that $x$ is recurrent if, and only if, $\mathbb{P}(\tau_x < \infty | X_0 = x) = 1,$ which is the first part of the equivalence in the recurrence case. The complement of that is precisely that $x$ is transient if, and only if, $\mathbb{P}(\tau_x < \infty | X_0 = x) < 1.$

For the remaining equivalences, let us suppose, first, that $x$ is transient. Then, as we have seen,

\[ q = \mathbb{P}(\tau_x < \infty | X_0 = x) < 1.\]

We compute the expectation of the number of passages by

\[ \begin{align*} \mathbb{E}[\eta_x | X_0 = x] & = \mathbb{E}\left[ \sum_{n=1}^\infty \mathbb{1}_{X_n = x} \bigg| X_0 = x\right] = \sum_{n=1}^\infty \mathbb{E}\left[\mathbb{1}_{X_n = x} \bigg| X_0 = x\right] \\ & = \sum_{n=1}^\infty \mathbb{P}(X_n = x | X_0 = x) = \sum_{n=1}^\infty p_n(x, x). \end{align*}\]

We can also compute this in a different way. Since $\eta_x$ is always an integer, its expectation is given by

\[ \mathbb{E}[\eta_x | X_0 = x] = \sum_{m=1}^\infty m \mathbb{P}(\eta_x = m | X_0 = x).\]

We need a way to calculate $\mathbb{P}(\eta_x = m | X_0 = x),$ for each integer $m\in\mathbb{N}.$ When $\eta_x = m,$ it means it returns to $x$ $m$ times and then it does not return anymore. This means that $\mathbb{P}(\eta_x = m | X_0 = x) = q^m.$ Thus,

\[ \mathbb{E}[\eta_x | X_0 = x] = \sum_{m=1}^\infty m q^m.\]

Let $S = \sum_{m=1}^\infty m q^m,$ so that $qS = \sum_{m=1}^\infty m q^{m+1} = \sum_{m=2}^\infty (m-1)q^m,$ and hence

\[ (1 - q)S = S - qS = q + \sum_{m=1}^\infty q^m = \sum_{m=1}^\infty q^m = \frac{q}{1 - q}.\]

Thus,

\[ \mathbb{E}[\eta_x | X_0 = x] = \frac{\mathbb{P}(\tau_x < \infty | X_0 = x)}{1 - \mathbb{P}(\tau_x < \infty | X_0 = x)}.\]

(In Lawler(2006), the number of passages includes the initial time $n=1,$ so that the formula obtained is $1/(1-q),$ instead of $q/(1 - q).$)

When

\[ q = \mathbb{P}(\tau_x < \infty | X_0 = x) = 1,\]

then

\[ \mathbb{P}(\tau_x < \infty | X_0 = x) \geq r,\]

for any $0 < r < 1,$ and we get, similarly, that

\[ \mathbb{E}[\eta_x | X_0 = x] \geq \sum_{m=1}^\infty m r^m = \frac{r}{(1 - r)} \rightarrow \infty,\]

as $r \rightarrow 1,$ so that $\mathbb{E}[\eta_x | X_0 = x] = \infty.$

This proves the identity between the expectation and the probability. In particular, the expectation is finite if, and only if, the probability is strictly less than one, which proves the remaining equivalences.

These equivalences are true, in general, only in the countable case; see page 222 of Robert and Casella (2004).

Another important aspect is that, when the probability of returning infinitely often is smaller than 1, its is actually zero.

Fact

If $x$ is a transient state, then, in fact,

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 0.\]

In the countable-space case that we are considering, an equivalent definition of recurrence can be made with the notion of return time.

Another equivalent definition of recurrence can be made with the notion number of passages. Indeed, when the chain keeps coming back infinitely often to a state $x,$ counting such times adds up to infinity. When this happens with probability one or even with positive probability, then the expectation of such counts is also infinite. On the other hand, if $x$ is transient, then, in fact, $\mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 0,$ so $\eta_x$ is finite almost surely, but that doesn't quise say that its expectation is finite. However, it is true, somehow, that we have the following equivalence.

Fact

A state $x$ is recurrent if, and only if,

\[ \mathbb{E}\left[\eta_x\right | X_0 = x] = \infty.\]

When every state is recurrent we say that the chain is recurrent.

Definition (recurrent chain)

The Markov chain is called recurrent when every state is recurrent, i.e.

\[ \mathbb{P}(X_n = x \textrm{ infinitely often} | X_0 = x) = 1, \quad \forall x\in\mathcal{X}.\]

For instance, we have seen that the random walk $X_{n+1} = X_n + B_n,$ where $B_n$ are Bernoulli i.i.d. with states $+1$ with probability $p$ and $-1$ with probability $1 - p,$ where $0 < p < 1,$ but it is not recurrent. For any $x, \in\mathcal{X}=\mathbb{Z}$ and $n\in\mathbb{N},$ the transition probability $K_n(x, y)$ is zero if $|y - x|$ and $n$ have different parity or when $n < |y - x|,$ and is given by the binomial distribution

\[ p_n(m) = \begin{pmatrix} n \\ i \end{pmatrix} p^i(1-p)^j,\]

with

\[ i = \frac{n + m}{2}, \quad j = n - i = \frac{n - m}{2}, \quad m = y - x,\]

when

\[ |m| \leq n, \quad m, n \textrm{ with same parity}\]

Thus, we can write

\[ K_n(x, y) = \begin{cases} \begin{pmatrix} n \\ \frac{n + y - x}{2} \end{pmatrix} p^{(n+y-x)/2}(1-p)^{(n + x - y)/2}, & |x - y| \leq n, \; x - y, n \textrm{ same parity}, \\ 0, & \textrm{otherwise}. \end{cases}\]

In particular, the probability of returning to $x$ is

\[ K_n(x, x) = \begin{cases} \begin{pmatrix} n \\ \frac{n}{2} \end{pmatrix} p^{n/2}(1-p)^{n/2}, & n \textrm{ is even}, \\ 0, & \textrm{otherwise}. \end{cases}\]

We have

\[ \{X_n = x \textrm{ infinitely often} | X_0 = x\} = \bigcap_{n \in\mathbb{N}}\bigcup_{m \geq n, m\in \mathbb{N}} \{X_m = x | X_0 = x\}.\]

Thus, by the Borel-Cantelli Lemma,

\[ \mathbb{P}\left(X_n = x \textrm{ infinitely often} | X_0 = x\right) = 0,\]

since

\[ \sum_{n\in\mathbb{N}} \mathbb{P}\left(X_m = x | X_0 = x\right) = \sum_{n\in\mathbb{N}, n \textrm{ even}} \begin{pmatrix} n \\ \frac{n}{2} \end{pmatrix} p^{n/2}(1-p)^{n/2} = \sum_{n\in\mathbb{N}} \begin{pmatrix} 2n \\ n \end{pmatrix} p^n(1-p)^n \approx \sum_{n\in\mathbb{N}} \frac{(2n)!}{2(n!)} (1/2)^n\]

Existence of invariant distribution

Theorem

Suppose that, for some $x\in\mathcal{X},$

\[ \mathbb{E}[\tau_{x} | X_0 = x] < \infty.\]

Then

\[ P_x(z) = \frac{1}{\mathbb{E}[\tau_{x} | X_0 = x]} \sum_{n=1}^\infty \mathbb{P}(X_n = z, \tau_{x} \geq n | X_0 = x)\]

defines an invariant probability distribution for the Markov chain.

The proof below is adapted from Theorem 6.37 of Robert and Casella (2004). Notice we are not assuming irreducibility, so we are not claiming that this invariant distribution is unique.

Proof

For the proof, we consider, for the sake of simplicity, the unnormalized measure

\[ {\tilde P}_x(z) = \sum_{n=1}^\infty \mathbb{P}(X_n = z, \tau_{x} \geq n | X_0 = x).\]

We show that it defines a positive and finite invariant measure, with ${\tilde P}_x(\mathcal{X}) = \mathbb{E}[\tau_{x} | X_0 = x].$ After that, we obtain the desired result by normalizing ${\tilde P}_x$ by the expectation $\mathbb{E}[\tau_{x} | X_0 = x].$

First of all, since the space is discrete and each ${\tilde P}_x(z) \geq 0,$ for $z\in\mathcal{X},$ it follows that ${\tilde P}_x$ defines indeed a measure on $\mathcal{X}.$ Let us check that ${\tilde P}_x$ is in fact a nontrivial and finite measure. We have

\[ \begin{align*} {\tilde P}_x(\mathcal{X}) & = \sum_{z\in\mathcal{X}} P_x(z) \\ & = \sum_{z\in\mathcal{X}}\sum_{n=1}^\infty \mathbb{P}(X_n = z, \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \sum_{z\in\mathcal{X}} \mathbb{P}(X_n = z, \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \mathbb{P}( \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \sum_{m=n}^\infty \mathbb{P}(\tau_{x} = m | X_0 = x) \\ & = \sum_{m=1}^\infty \sum_{n=1}^{m} \mathbb{P}(\tau_{x} = m | X_0 = x) \\ & = \sum_{m=1}^\infty m \mathbb{P}(\tau_{x} = m | X_0 = x) \\ & = \mathbb{E}[\tau_{x} | X_0 = x] \end{align*}\]

Since it is assumed that

\[ \mathbb{E}[\tau_{x} | X_0 = x] < \infty,\]

it follows that the measure ${\tilde P}$ is finite. And since, by definition, $\tau_x$ is an integer with $1 \leq \tau_x \leq +\infty,$ the measure is non-trivial, i.e. it is positive, with

\[ 1 \leq {\tilde P}(\mathcal{X}) < \infty.\]

In particular,

\[ \begin{align*} {\tilde P}(\mathcal{x}) & = \sum_{n=1}^\infty \mathbb{P}(X_n = x, \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \mathbb{P}(X_n = x, \tau_{x} = n | X_0 = x) \\ & = \sum_{n=1}^\infty \mathbb{P}(\tau_{x} \geq n | X_0 = x) \\ & = \mathbb{P}(\cup_{n\in\mathbb{N}} \{\tau_{x} = n | X_0 = x\}) \\ & = \mathbb{P}(\tau_{x} < \infty | X_0 = x). \end{align*}\]

The condition

\[ \mathbb{E}[\tau_{x} | X_0 = x] < \infty.\]

implies that $\tau_{x}$ must be finite almost surely, so that

\[ {\tilde P}(\mathcal{x}) = \mathbb{P}(\tau_{x} < \infty | X_0 = x) = 1.\]

Now, let us check that ${\tilde P}_x$ is invariant. We need to show that

\[ \sum_{y\in\mathcal{X}} K(y, z){\tilde P}_x(y) = {\tilde P}_x(z),\]

for every $z\in\mathcal{X}.$ For that, we write

\[ \begin{align*} \sum_{y\in\mathcal{X}} K(y, z){\tilde P}_x(y) & = \sum_{y\in\mathcal{X}} K(y, z)\sum_{n=1}^\infty \mathbb{P}(X_n = y, \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \sum_{y\in\mathcal{X}} K(y, z) \mathbb{P}(X_n = y, \tau_{x} \geq n | X_0 = x) \end{align*}\]

We split the sum according to $y=x$ and $y\neq x,$ so that

\[ \begin{align*} \sum_{y\in\mathcal{X}} K(y, z){\tilde P}_x(y) & = \sum_{n=1}^\infty K(x, z) \mathbb{P}(X_n = x, \tau_{x} \geq n | X_0 = x) \\ & \qquad + \sum_{n=1}^\infty \sum_{y\neq x} K(y, z) \mathbb{P}(X_n = y, \tau_{x} \geq n | X_0 = x) \\ & = K(x, z) {\tilde P}(x) + \sum_{n=1}^\infty \sum_{y\neq x} \mathbb{P}(X_{n+1} = z, X_n = y, \tau_{x} \geq n | X_0 = x) \\ \end{align*}\]

For $y\neq z,$ we have

\[ \mathbb{P}(X_{n+1} = z, X_n = y, \tau_{x} \geq n | X_0 = x) = \mathbb{P}(X_{n+1} = z, X_n = y, \tau_{x} \geq n+1 | X_0 = x).\]

Thus,

\[ \begin{align*} \sum_{y\in\mathcal{X}} K(y, z){\tilde P}_x(y) & = K(x, z) {\tilde P}(x) + \sum_{n=1}^\infty \sum_{y\neq x} \mathbb{P}(X_{n+1} = z, X_n = y, \tau_{x} \geq n+1 | X_0 = x) \\ & = K(x, z) {\tilde P}(x) + \sum_{n=1}^\infty \mathbb{P}(X_{n+1} = z, \tau_{x} \geq n+1 | X_0 = x) \\ & = K(x, z) {\tilde P}(x) + \sum_{n=2}^\infty \mathbb{P}(X_{n} = z, \tau_{x} \geq n | X_0 = x), \end{align*}\]

where in the last step we just reindexed the summation. Now we use that ${\tilde P}(x) = 1$ (as proved above) and that

\[ K(x, z) = \mathbb{P}(X_1 = z | X_0 = x) = \mathbb{P}(X_1 = z, \tau_{x} \geq 1 | X_0 = x)\]

(since $\tau_{x} \geq 1$ always), to obtain

\[ \begin{align*} \sum_{y\in\mathcal{X}} K(y, z){\tilde P}_x(y) & = \mathbb{P}(X_1 = z, \tau_{x} \geq 1 | X_0 = x) + \sum_{n=2}^\infty \mathbb{P}(X_{n} = z, \tau_{x} \geq n | X_0 = x) \\ & = \sum_{n=1}^\infty \mathbb{P}(X_{n} = z, \tau_{x} \geq n | X_0 = x) \\ & = {\tilde P}(z), \end{align*}\]

proving the invariance.

Now, as mentioned above, since ${\tilde P}(\mathcal{X}) = \mathbb{E}[\tau_{x} | X_0 = x]$ is finite and positive, we can normalize ${\tilde P}$ by this expectation to obtain the invariant probability distribution

\[ P(z) = \frac{1}{\mathbb{E}[\tau_{x} | X_0 = x]} \sum_{n=1}^\infty \mathbb{P}(X_{n} = z, \tau_{x} \geq n | X_0 = x).\]

Kac's Theorem on invariant distribution of an irreducible chain

Kac's Theorem says that if $(X_n)_n$ is irreducible and it has a stationary distribution $P,$ then

\[ P(x) = \frac{1}{\mathbb{E}[\tau_x | X_n = x]}, \quad \forall x\in\mathcal{X}.\]

The idea is that, $m_x = \mathbb{E}[\tau_x | X_n = x]$ is the average time that the chain takes to start from $x$ and come back to $x.$ Every state $x$ is visited repeatedly, with an average time of $m_x$ between consecutive visits.

Irreducibility in the discrete-space case

In the discrete-space case, $P$-irreducibility can be stated with point sets, so that the Markov chain is $P$-irreducible if, and only if,

\[ x, y\in\mathcal{X},\; P(y) > 0 \Longrightarrow \exists n=n(x, y)\in\mathbb{N}, \; K_n(x, y) > 0.\]

Positivity of an invariant probability distribution

If a discrete-space Markov chain has an invariant probability distribution $P$ and it is $P$-irreducible, then $P$ must be everywhere strictly positive, i.e. $P(x) > 0,$ for all $x\in \mathcal{X}.$

Indeed, since $P$ is nontrivial, there exists $y\in\mathcal{X}$ such that $P(y) > 0.$ Now, for any $x,$ since the chain is $P$ irreducible, we have $K_n(x, y) > 0,$ for some $n\in\mathbb{N}.$

Since $P$ is invariant, we have $P K_n = P,$ and then

\[ P(x) = \sum_{z\in\mathcal{X}} K_n(z, x) P(z).\]

This can be estimated from below by restricting the summation to $z = y,$ so that

\[ P(x) = \sum_{z\in\mathcal{X}} K_n(z, x) P(z) \geq K_n(y, x) P(z) > 0,\]

where we used that $K_n(y, x) > 0$ and $P(z) > 0.$ Thus,

\[ P(x) > 0, \quad \forall x\in\mathcal{X},\]

showing that $P$ is everywhere strictly positive.

Uniqueness of the invariant probability distribution

Suppose, again, that we have a finite-state Markov chain with an invariant probability distribution $P$ which is $P$-irreducible. We have just seen that $P$ must be everywhere strictly positive, i.e. $P(x) > 0,$ for all $x\in \mathcal{X}.$ We now use this to show that the invariant probability distribution must be unique.

References

  1. C. P. Robert, G. Casella (2004), "Monte Carlo Statistical Methods", Second Edition, Springer Texts in Statistics, Springer New York, NY
  2. G. F. Lawler(2006), "Introduction to Stochastic Processes", 2nd Edition. Chapman and Hall/CRC, New York.