# MCMC

### Introduction

Markov Chain Monte-Carlo (MCMC) is a method of sampling from a distribution, when it is difficult to sample from the distribution directly. It can be used with Monte-Carlo Integration for numerical calculation of integrals.

It explores the state space using a Markov Chain mechanism, and produces samples which mimic samples drawn from the distribution directly. We assume that we can evaluate up to a normalizing constant, but it is difficult to draw samples from it.

Let us consider the situation when the number of possible states in a Markov chain is . Then we can choose an initial distribution and have a transition matrix . Following the transition matrix at the first step, we get the distribution . If we repeat the process several times and it converges to a distribution , we say it converged to an invariant distribution. The process converges if the matrix has properties of irreducibility and aperiodicity.

**Definition 1.** A transition matrix is *irreducible* if for any state of the Markov chain there is a positive probability of visiting all other states.

**Definition 2.** A transition matrix is *aperiodical* if the Markov chain does not get trapped into cycles.

A sufficient condition for the transition matrix to satisfy these requirenments is the *reversibility* condition:

In this case here is the invariant distribution. Indeed, summing both sides over gives us

In continuous state spaces the transition matrix becomes an integral kernel and becomes the corresponding eigenfunction.

MCMC samples are organized in a way that the desired is the invariant distribution:

### The Metropolis-Hastings algorithm

Let us have a proposal distribution and invariant distribution . An MH step involves sampling from the proposal distribution a candidate value . The Markov chain then moves towards with acceptance probability , otherwise it remains at .

If the MH algorithm transforms to the Metropolis algorithm.

### Integration using MCMC

To calculate the integral it is possible to sample from the uniform distribution and calculate the integral using Monte-Carlo integration. But MCMC method of sampling from may lead to much better convergence. First one needs to perform *burn-in* state, when a Markov chain finds a way to sample from a distribution close to using MH algorithm. Then search for the distribution continues, but the sum is calculated. The resulting approximation of the integral is , where is the length of the burn-in stage, and is the length of the sampling stage.

MCMC is also used to solve optimization problems like by applying a so-called simulated annealing technique.

### Bibliography

- Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E. Equations of State Calculations by Fast Computing Machines.
*Journal of Chemical Physics*,**21**(6): 1087–1092 (1953). - Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications.
*Biometrika*,**57**(1):97–109 (1970). - Christophe Andrieu et al. An Introduction to MCMC for Machine Learning,
*Machine Learning*,**50**: 5–43 (2003).