The definition of Markov chains has evolved during the 20th century. In 1953 the term Markov chain was used for stochastic processes with a discrete or continuous index set, living on a countable or finite state space; see Doob [1] or Chung [2]. Since the late 20th century it has become more common to consider a Markov chain as a stochastic process with a discrete index set, living on a measurable state space.[3][4][5]
Denote by $(E, \Sigma)$ a measurable space and by $p$ a Markov kernel with source and target $(E, \Sigma)$. A stochastic process $(X_n)_{n \in \mathbb{N}}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is called a time-homogeneous Markov chain with Markov kernel $p$ and start distribution $\mu$ if

$$\mathbb{P}[X_0 \in A_0,\, X_1 \in A_1,\, \dots,\, X_n \in A_n] = \int_{A_0} \cdots \int_{A_{n-1}} p(y_{n-1}, A_n)\, p(y_{n-2}, dy_{n-1}) \cdots p(y_0, dy_1)\, \mu(dy_0)$$
is satisfied for any $n \in \mathbb{N}$ and $A_0, \dots, A_n \in \Sigma$. For any Markov kernel and any probability measure one can construct an associated Markov chain.[6]
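The construction above can be made concrete on a finite state space, where a Markov kernel reduces to a row-stochastic transition matrix. The following is a minimal sketch; the matrix `P`, the start distribution `mu`, and the helper names are illustrative assumptions, not taken from the text.

```python
import random

# Finite state space E = {0, 1, 2}; p(x, .) is the x-th row of P.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
mu = [1.0, 0.0, 0.0]  # start distribution: the chain starts in state 0

def sample(weights, rng):
    """Draw an index according to the given probability weights."""
    u, acc = rng.random(), 0.0
    for i, w in enumerate(weights):
        acc += w
        if u < acc:
            return i
    return len(weights) - 1

def simulate_chain(P, mu, n, rng):
    """Simulate X_0, ..., X_n: draw X_0 ~ mu, then X_{k+1} ~ p(X_k, .)."""
    x = sample(mu, rng)
    path = [x]
    for _ in range(n):
        x = sample(P[x], rng)
        path.append(x)
    return path

path = simulate_chain(P, mu, 10, random.Random(0))
```

Each step only reads the current state, which is exactly the Markov property in this discrete setting.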
For any measure $\mu\colon \Sigma \to [0, \infty]$ and any $\mu$-integrable function $f\colon E \to \mathbb{R} \cup \{\infty, -\infty\}$ we denote the Lebesgue integral by $\int_E f(x)\, \mu(dx)$. For the measure $\nu_x\colon \Sigma \to [0, \infty]$ defined by $\nu_x(A) := p(x, A)$ we use the following notation:

$$\int_E f(y)\, p(x, dy) := \int_E f(y)\, \nu_x(dy).$$
If $\mu$ is the Dirac measure at $x$, we denote for a Markov kernel $p$ with start distribution $\mu$ the associated Markov chain as $(X_n)_{n \in \mathbb{N}}$ on $(\Omega, \mathcal{F}, \mathbb{P}_x)$ and the expectation value

$$\mathbb{E}_x[X] := \int_\Omega X(\omega)\, \mathbb{P}_x(d\omega)$$

for a $\mathbb{P}_x$-integrable function $X$. By definition, we then have $\mathbb{P}_x[X_0 = x] = 1$.
For any measurable function $f\colon E \to [0, \infty]$ we have the following relation:[7]

$$\int_E f(y)\, p(x, dy) = \mathbb{E}_x[f(X_1)].$$
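On a finite state space this relation can be checked numerically: the integral $\int_E f(y)\, p(x, dy)$ becomes a weighted sum over the kernel row, and $\mathbb{E}_x[f(X_1)]$ can be estimated by simulation. The matrix `P`, the function values `f`, and the helper names are assumptions for illustration.

```python
import random

# Finite state space E = {0, 1, 2}; p(x, .) is the x-th row of P.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
f = [1.0, 2.0, 4.0]  # a measurable function f: E -> [0, inf)
x = 1                # starting point; under P_x we have X_0 = x a.s.

# Right-hand side as a sum: ∫ f(y) p(x, dy) = sum_y f(y) p(x, y).
exact = sum(f[y] * P[x][y] for y in range(len(P)))

# Left-hand side: Monte Carlo estimate of E_x[f(X_1)].
rng = random.Random(0)

def step(x):
    """Draw X_1 ~ p(x, .)."""
    u, acc = rng.random(), 0.0
    for y, w in enumerate(P[x]):
        acc += w
        if u < acc:
            return y
    return len(P) - 1

estimate = sum(f[step(x)] for _ in range(20000)) / 20000
```

The empirical average converges to the integral by the law of large numbers.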
For a Markov kernel $p$ with start distribution $\mu$ one can introduce a family of Markov kernels $(p_n)_{n \in \mathbb{N}}$ by

$$p_{n+1}(x, A) := \int_E p_n(y, A)\, p(x, dy)$$

for $n \in \mathbb{N},\, n \geq 1$, and $p_1 := p$. For the Markov chain $(X_n)_{n \in \mathbb{N}}$ associated with $p$ and $\mu$ one obtains

$$\mathbb{P}[X_n \in A] = \int_E p_n(x, A)\, \mu(dx).$$
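On a finite state space the recursion above amounts to taking powers of the transition matrix: $p_n$ is the $n$-th matrix power of $p$. A hedged sketch, with an illustrative matrix `P` and helper name `kernel_power`:

```python
# Finite state space; p(x, .) is the x-th row of P.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]

def kernel_power(P, n):
    """Return the n-step kernel p_n (n >= 1) via
    p_{n+1}(x, A) = sum_y p(x, y) * p_n(y, A)."""
    size = len(P)
    Pn = [row[:] for row in P]  # p_1 = p
    for _ in range(n - 1):
        Pn = [
            [sum(P[x][y] * Pn[y][z] for y in range(size)) for z in range(size)]
            for x in range(size)
        ]
    return Pn

P3 = kernel_power(P, 3)
# Each row of p_n is again a probability measure on E.
row_sums = [sum(row) for row in P3]
```

Each `p_n(x, .)` row stays a probability measure, reflecting that $p_n$ is itself a Markov kernel.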
A probability measure $\mu$ is called a stationary measure of a Markov kernel $p$ if

$$\mu(A) = \int_E p(x, A)\, \mu(dx)$$

holds for any $A \in \Sigma$. If $(X_n)_{n \in \mathbb{N}}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ denotes the Markov chain associated with a Markov kernel $p$ with stationary measure $\mu$, and the distribution of $X_0$ is $\mu$, then all $X_n$ have the same probability distribution, namely

$$\mathbb{P}[X_n \in A] = \mu(A)$$

for any $A \in \Sigma$.
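For a finite state space, stationarity $\mu(A) = \int_E p(x, A)\, \mu(dx)$ is the fixed-point condition $\mu P = \mu$ for the row vector $\mu$. A small sketch with an assumed matrix `P` and candidate measure `mu`:

```python
# Finite state space; p(x, .) is the x-th row of P.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
mu = [0.25, 0.5, 0.25]  # candidate stationary measure

def apply_kernel(mu, P):
    """Return the distribution of X_{n+1} given X_n ~ mu, i.e. the row vector mu P."""
    size = len(P)
    return [sum(mu[x] * P[x][y] for x in range(size)) for y in range(size)]

mu_next = apply_kernel(mu, P)
is_stationary = all(abs(a - b) < 1e-12 for a, b in zip(mu_next, mu))
```

Starting the chain in `mu` and applying the kernel returns the same distribution, so every $X_n$ is distributed according to $\mu$.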
A Markov kernel $p$ is called reversible with respect to a probability measure $\mu$ if

$$\int_A p(x, B)\, \mu(dx) = \int_B p(x, A)\, \mu(dx)$$

holds for any $A, B \in \Sigma$. Setting $A = E$ shows that if $p$ is reversible with respect to $\mu$, then $\mu$ must be a stationary measure of $p$: the left-hand side becomes $\int_E p(x, B)\, \mu(dx)$, while the right-hand side equals $\int_B p(x, E)\, \mu(dx) = \mu(B)$.
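On a finite state space, reversibility with respect to $\mu$ reduces to the detailed-balance condition $\mu(x)\, p(x, y) = \mu(y)\, p(y, x)$ for all pairs of states; the condition for general $A, B$ follows by summing. A hedged sketch with an assumed reversible pair `P`, `mu`:

```python
# Finite state space; p(x, .) is the x-th row of P.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
mu = [0.25, 0.5, 0.25]

def is_reversible(P, mu, tol=1e-12):
    """Check detailed balance: mu(x) p(x, y) == mu(y) p(y, x) for all x, y."""
    size = len(P)
    return all(
        abs(mu[x] * P[x][y] - mu[y] * P[y][x]) < tol
        for x in range(size)
        for y in range(size)
    )

def is_stationary(P, mu, tol=1e-12):
    """Check the fixed-point condition mu P == mu."""
    size = len(P)
    return all(
        abs(sum(mu[x] * P[x][y] for x in range(size)) - mu[y]) < tol
        for y in range(size)
    )

rev = is_reversible(P, mu)
stat = is_stationary(P, mu)  # reversibility implies stationarity
```

As the argument above shows, whenever `is_reversible` holds, `is_stationary` must hold as well.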
1. Joseph L. Doob: Stochastic Processes. New York: John Wiley & Sons, 1953.
2. Kai L. Chung: Markov Chains with Stationary Transition Probabilities. Second edition. Berlin: Springer-Verlag, 1974.
3. Sean Meyn and Richard L. Tweedie: Markov Chains and Stochastic Stability. Second edition, 2009.
4. Daniel Revuz: Markov Chains. Second edition, 1984.
5. Rick Durrett: Probability: Theory and Examples. Fourth edition, 2005.