StatLect
Index > Asymptotic theory

The Monte Carlo method

The Monte Carlo method is a computational method that consists in using a computer-generated sample from a given probability distribution to produce a plug-in estimate of some feature of the given distribution (such as, for example, a moment or a quantile).

In what follows, we are often going to refer to the plug-in principle. If you are not familiar with the concept, you are advised to first read the lecture entitled Plug-in principle.

Table of Contents

Monte Carlo approximation

Let X be a random variable with distribution function [eq1] and suppose that we can generate (through a computer) a sample [eq2]of realizations of n random variables X_1, ..., X_n all having distribution function $F_{X}$.

Denote by [eq3]a feature of the distribution $F_{X}$ (e.g., its mean, its variance, or one of its quantiles).

Denote by [eq4] the empirical distribution of the sample $xi _{n}$ (i.e., a probability distribution that assigns probability $1/n$ to each one of the values [eq5]).

Then, the plug-in estimate[eq6]is a Monte Carlo approximation of [eq7].

In other words, the feature [eq7] of the orignal distribution is approximated by the same feature [eq9] of the empirical distribution of the computer-generated sample.

Example Let X be a random variable having distribution function $F_{X}$ and let [eq10] be a function. Suppose we want to approximate [eq11]where we have written the expected value of g(x) as a Riemann-Stieltjes integral (see the lecture entitled Expected value) in order to highlight the fact that it depends on the distribution function $F_{X}$. Now, suppose we have a computer-generated sample of n draws $x_{1}$, ..., $x_{n}$ from the distribution $F_{X}$. Denote their empirical distribution function by $F_{n}$. Then, the Monte Carlo approximation of [eq12] is[eq13]where we have used the facts that the empirical distribution $F_{n}$ is discrete and the expected value of a discrete random variable is the weighted sum of the values it can take, with weights equal to their respective probabilities (in this case the weights are all equal to $1/n$ ). In other words, we can approximate the expected value [eq14] with the sample mean of the computer-generated draws [eq15].

Monte Carlo integration

When the Monte Carlo method is used to approximate an expected value (as in the previous example), then the method is called Monte Carlo integration. It consists in approximating the expected value of a random variable X with the mean of a computer-generated sample of observations drawn at random from the distribution of X:[eq16]where $x_{1}$,...,$x_{n}$ are the computer-generated draws from the distribution of X. If the draws are IID, then Kolmogorov's Strong Law of Large Numbers applies and the approximation converges almost surely to [eq17] as the number of draws n becomes large. It is also possible, however, that the draws are not IID (as, for instance, in the so-called Markov Chain Monte Carlo methods), but the approximation converges anyway to [eq18] because a Law of Large Numbers for correlated sequences applies (see the lecture entitled Laws of Large Numbers for more details).

This approximation method is called Monte Carlo integration because the expected value being approximated is in fact an integral:[eq19]where [eq20] is the distribution function of X, and the integral is a Riemann-Stieltjes integral. Moreover, if X is a continuous variable, the integral can be written as an ordinary Riemann integral:[eq21]where [eq22] is the probability density function of X.

It is very important to note that the method of Monte Carlo integration is often used when the distribution function of X is not known analytically, but X can be written as a function $gleft( Z
ight) $ of a random vector Z and it is easy to produce computer-generated draws of Z. In such a case, n draws $z_{1}$,...,$z_{n}$ are generated from the distribution of Z, which are then transformed into n draws from the distribution of $X $:[eq23]and the expected value of X is approximated by[eq24]

Monte Carlo integration is also used to compute integrals that have no particular probabilistic interpretation but can be written as expected values. Suppose the integral to be computed is[eq25]where [eq26] is any integrable function. Now, take any legitimate probability density function $fleft( x
ight) $ such that [eq27] for $aleq xleq b$ and [eq28] for $x<a$ or $x>b$. Then we can write[eq29]where X is a random variable having probability density function $fleft( x
ight) $. If we are able to produce a computer-generated sample of draws $x_{1}$,...,$x_{n}$ from the distribution of X, then the integral can be approximated as follows:[eq30]

The computer-generated sample

Up to this point, we have not clarified what we mean by a computer-generated sample of draws from a given distribution. While we will not go into details, we would like to highlight some facts.

First of all, there exist several algorithms that allow to use a computer to produce sequences of numbers, called pseudo-random numbers, which are not truly random, but approximate fairly well the properties of a truly random sequence. If a statistician observed sequences of numbers produced by these algorithms without knowing how they were produced, she would conclude that they are indeed random (even after carrying out rigorous statistical tests to check their randomness). See Wikipedia's article on pseudo-random number generators for more details.

Second, most available algorithms for producing sequences of pseudo-random numbers produce sequences of independent draws from a uniform distribution on the interval $left[ 0,1
ight] $. These sequences of uniform random numbers are then transformed in some way in order to obtain sequences drawn from other distributions. For example, if $U$ is a pseudo-random number having a uniform distribution on $left[ 0,1
ight] $ and $F_{X}$ is an invertible distribution function, then the number[eq31]has distribution function $F_{X}$.

Proof

We have[eq32]

This method of generating pseudo-random numbers from non-uniform distributions is called inverse transformation method. Lots of other methods exist (see, e.g., Wikipedia's article on non-uniform pseudo-random variate generation).

Third, all commonly used statistical software packages include efficient and thoroughly tested pseudo-random number generators for extracting random samples from the most important probability distributions. So, if you need samples of observations from these distributions to compute a Monte Carlo approximation, all you need to do is to find out how to use the pseudo-random number generator that comes with your favorite statistical software.

Approximation errors

By approximating a quantity of interest [eq7] with its Monte Carlo approximation [eq9], we commit an approximation error[eq35]

In general, the properties of this approximation error (mean, variance, asymptotic convergence) depend on the properties of the plug-in estimator [eq36]. But, as noted in the lecture on the plug-in principle, these properties can be difficult to study in general terms. However, things are relatively simple in the particular case of Monte Carlo integration, that is, when $T$ is an integral (an expected value). In this case,[eq37]and[eq38]

When X_1,...,X_n are independent and have the same distribution of X, then [eq9] converges almost surely to [eq40] by Kolmogorov's Strong Law of Large Numbers (provided [eq41]). But this implies that the approximation error converges to zero as the sample size n gets large. Furthermore, the expected value and the variance of the approximation error can be easily computed:[eq42]

Proof

The expected value is[eq43]and the variance is[eq44]

From these formulae it is clear that the variance of the approximation error can be made as small as desired by increasing the sample size n of the computer-generated sample (at the cost of increasing the computational intensity of the approximation, that is, the computer time and memory required to compute the approximation).

Note also that, since [eq45] is usually not known, it can be estimated by the sample variance[eq46]and the variance of the approximation error is estimated by[eq47]

Finally, let us mention that there exist several techniques that can be used to reduce the variance of a Monte Carlo approximation. These techniques are called variance reduction techniques. One of the most popular of these techniques is introduced in the lecture entitled Importance sampling.

The book

Most of the learning materials found on this website are now available in a traditional textbook format.