
Sequences of random variables and their convergence

One of the central topics in probability theory and statistics is the study of sequences of random variables, that is, of sequences $\{X_n\}$ whose generic element $X_n$ is a random variable.

There are several reasons why sequences of random variables are important:

  1. Often, we need to analyze a random variable $X$, but for some reason $X$ is too complex to analyze directly. In this case, we usually approximate $X$ by simpler random variables $X_n$ that are easier to study; these approximating random variables are arranged into a sequence $\{X_n\}$ and become better and better approximations of $X$ as $n$ increases. This is exactly what we did when we introduced the Lebesgue integral.

  2. In statistical theory, $X_n$ is often an estimate of an unknown quantity whose value and properties depend on the sample size $n$ (the number of observations used to compute the estimate). Usually, we are able to analyze the properties of $X_n$ only asymptotically (i.e., as $n$ tends to infinity). In this case, $\{X_n\}$ is a sequence of estimates, and we analyze the properties of its limit in the hope that a large sample (the one we observe) and an infinite sample (the one we analyze by taking the limit of $X_n$) behave similarly.

  3. In many applications a random variable is observed repeatedly through time (for example, the price of a stock is observed every day). In this case $\{X_n\}$ is the sequence of observations on the random variable and $n$ is a time index (in the stock price example, $X_n$ is the price observed in the $n$-th period).


Terminology

In this lecture, we introduce some terminology related to sequences of random variables.

Realization of a sequence

Let $\{x_n\}$ be a sequence of real numbers and $\{X_n\}$ a sequence of random variables. If the real number $x_n$ is a realization of the random variable $X_n$ for every $n$, then we say that the sequence of real numbers $\{x_n\}$ is a realization of the sequence of random variables $\{X_n\}$ and we write$$\{x_n\} \text{ is a realization of } \{X_n\}$$
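For instance, the following minimal sketch (in Python with NumPy; the standard normal distribution and the seed are illustrative assumptions, not part of the definition) draws the first ten terms of a sequence twice, producing two distinct realizations of the same sequence $\{X_n\}$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# The sequence {X_n}: here each X_n is assumed to be standard normal.
# One call produces one realization {x_n} of the first ten terms.
x = rng.standard_normal(10)
print(x)

# A second call produces a different realization of the same sequence.
x2 = rng.standard_normal(10)
print(x2)
```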

Sequences on a sample space

Let $\Omega$ be a sample space and let $\{X_n\}$ be a sequence of random variables. We say that $\{X_n\}$ is a sequence of random variables defined on the sample space $\Omega$ if all the random variables $X_n$ belonging to the sequence are functions from $\Omega$ to $\mathbb{R}$.

Independent sequences

Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is an independent sequence of random variables (or a sequence of independent random variables) if every finite subset of $\{X_n\}$ (i.e., every finite set of random variables belonging to the sequence) is a set of mutually independent random variables.

Identically distributed sequences

Let $\{X_n\}$ be a sequence of random variables. Denote by $F_{X_n}(x)$ the distribution function of a generic element $X_n$ of the sequence. We say that $\{X_n\}$ is a sequence of identically distributed random variables if any two elements of the sequence have the same distribution function:$$F_{X_n}(x) = F_{X_m}(x) \quad \text{for any } n, m \text{ and for any } x$$
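As an illustration, the sketch below (Python/NumPy; the Exponential(1) distribution and the sample sizes are illustrative assumptions) simulates many realizations of the pair $(X_1, X_2)$ from an identically distributed sequence and checks that the empirical distribution functions of the two terms nearly coincide.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# 100,000 realizations of the first two terms (X_1, X_2) of a sequence
# whose terms are all Exponential(1), hence identically distributed.
draws = rng.exponential(scale=1.0, size=(100_000, 2))

# The empirical distribution functions of X_1 and X_2 should nearly
# coincide at every point, since F_{X_1} = F_{X_2}.
for x in (0.5, 1.0, 2.0):
    f1 = np.mean(draws[:, 0] <= x)  # estimate of F_{X_1}(x)
    f2 = np.mean(draws[:, 1] <= x)  # estimate of F_{X_2}(x)
    print(f"x = {x}: F_X1(x) ~ {f1:.3f}, F_X2(x) ~ {f2:.3f}")
```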

IID sequences

Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is a sequence of independent and identically distributed random variables (or an IID sequence of random variables) if $\{X_n\}$ is both a sequence of independent random variables and a sequence of identically distributed random variables.
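The three notions (independent, identically distributed, IID) can be contrasted with a small sketch (Python/NumPy; the specific distributions chosen are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# IID: independent draws from one fixed distribution.
x_iid = rng.normal(loc=0.0, scale=1.0, size=1000)

# Identically distributed but NOT independent: every term is the same draw.
z = rng.normal()
x_dependent = np.full(1000, z)

# Independent but NOT identically distributed: the variance grows with n.
n = np.arange(1, 1001)
x_heterogeneous = rng.normal(loc=0.0, scale=np.sqrt(n))
```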

Stationary sequences

Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. Take a first group of $q$ successive terms of the sequence, $X_{n+1}$, ..., $X_{n+q}$. Now take a second group of $q$ successive terms, $X_{n+k+1}$, ..., $X_{n+k+q}$. The second group is located $k$ positions after the first group. Denote the joint distribution function of the first group of terms by$$F_{X_{n+1},\ldots,X_{n+q}}(x_1,\ldots,x_q)$$and the joint distribution function of the second group of terms by$$F_{X_{n+k+1},\ldots,X_{n+k+q}}(x_1,\ldots,x_q)$$

The sequence $\{X_n\}$ is said to be stationary (or strictly stationary) if and only if$$F_{X_{n+1},\ldots,X_{n+q}}(x_1,\ldots,x_q) = F_{X_{n+k+1},\ldots,X_{n+k+q}}(x_1,\ldots,x_q)$$for any $n,k,q \in \mathbb{N}$ and for any vector $(x_1,\ldots,x_q) \in \mathbb{R}^q$.

In other words, a sequence is strictly stationary if and only if the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ have the same distribution (for any $n$, $k$ and $q$). Requiring strict stationarity is weaker than requiring that a sequence be IID (see IID sequences above): if $\{X_n\}$ is an IID sequence, then it is also strictly stationary, while the converse is not necessarily true.
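A concrete example of a strictly stationary sequence that is not IID is the constant sequence $X_n = Z$ for every $n$, where $Z$ is a single random variable: any group of $q$ successive terms is $(Z, \ldots, Z)$, so its joint distribution does not depend on the group's position, yet the terms are perfectly dependent. A minimal sketch (Python/NumPy; the standard normal choice for $Z$ is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Strictly stationary but not IID: X_n = Z for every n. Any group of q
# successive terms is (Z, ..., Z), whatever its position in the sequence.
def constant_sequence(length, rng):
    z = rng.normal()           # a single draw of Z
    return np.full(length, z)  # every term equals that same realization

print(constant_sequence(10, rng))  # all terms equal: maximally dependent
```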

Weakly stationary sequences

Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is a covariance stationary sequence (or weakly stationary sequence) if$$\mathrm{E}[X_n] = \mu \quad \text{for any } n \tag{1}$$$$\mathrm{Cov}(X_n, X_{n-j}) = \gamma_j \quad \text{for any } n \text{ and } j \tag{2}$$where $\mu$ and the $\gamma_j$ are constants that do not depend on $n$, and $n$ and $j$ are, of course, integers.

Property (1) means that all the random variables belonging to the sequence $\{X_n\}$ have the same mean.

Property (2) means that the covariance between a term $X_n$ of the sequence and the term located $j$ positions before it ($X_{n-j}$) is always the same, irrespective of how $X_n$ has been chosen. In other words, $\mathrm{Cov}(X_n, X_{n-j})$ depends only on $j$ and not on $n$. Note also that property (2) implies that all the random variables in the sequence have the same variance (remember that $\mathrm{Var}[X_n] = \mathrm{Cov}(X_n, X_n)$):$$\mathrm{Var}[X_n] = \mathrm{Cov}(X_n, X_{n-0}) = \gamma_0 \quad \text{for any } n$$

Note that strict stationarity (see above) implies weak stationarity only if the mean $\mathrm{E}[X_n]$ and all the covariances $\mathrm{Cov}(X_n, X_{n-j})$ exist and are finite. Obviously, covariance stationarity does not imply strict stationarity (the former imposes restrictions only on the first and second moments, while the latter imposes restrictions on the whole distribution).
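A standard example of a covariance stationary sequence is a Gaussian AR(1) process $X_n = \rho X_{n-1} + \varepsilon_n$ with $|\rho| < 1$, started from its stationary distribution: with unit-variance innovations, $\mathrm{E}[X_n] = 0$ and $\mathrm{Cov}(X_n, X_{n-j}) = \rho^j / (1-\rho^2)$. The sketch below (Python/NumPy; $\rho = 0.8$, the seed and the sample length are illustrative assumptions) checks both properties numerically.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

# Gaussian AR(1): X_n = rho * X_{n-1} + eps_n, |rho| < 1, with the first
# term drawn from the stationary distribution N(0, 1 / (1 - rho^2)).
rho, T = 0.8, 200_000
x = np.empty(T)
x[0] = rng.normal(scale=np.sqrt(1.0 / (1.0 - rho**2)))
eps = rng.standard_normal(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]

print("sample mean:", x.mean())  # close to the common mean mu = 0
for j in (0, 1, 2):
    # sample autocovariance at lag j versus the theoretical gamma_j
    gamma_hat = np.mean((x[j:] - x.mean()) * (x[: T - j] - x.mean()))
    gamma = rho**j / (1.0 - rho**2)
    print(f"lag {j}: estimated {gamma_hat:.3f}, theoretical {gamma:.3f}")
```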

Mixing sequences

Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. Intuitively, $\{X_n\}$ is a mixing sequence if any two groups of terms of the sequence that are far apart from each other are approximately independent (and the farther apart they are, the closer they are to being independent).

Take a first group of $q$ successive terms of the sequence, $X_{n+1}$, ..., $X_{n+q}$. Now take a second group of $q$ successive terms, $X_{n+k+1}$, ..., $X_{n+k+q}$. The second group is located $k$ positions after the first group. The two groups of terms are independent if and only if$$\mathrm{E}\left[f(X_{n+1},\ldots,X_{n+q})\, g(X_{n+k+1},\ldots,X_{n+k+q})\right] = \mathrm{E}\left[f(X_{n+1},\ldots,X_{n+q})\right] \mathrm{E}\left[g(X_{n+k+1},\ldots,X_{n+k+q})\right]$$for any two functions $f$ and $g$. This is just the definition of independence between the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ (see Mutually independent random vectors). Trivially, the above condition can be written as$$\mathrm{Cov}\left[f(X_{n+1},\ldots,X_{n+q}),\, g(X_{n+k+1},\ldots,X_{n+k+q})\right] = 0$$

If this condition holds asymptotically (i.e., as $k \rightarrow \infty$), then we say that the sequence $\{X_n\}$ is mixing.

Definition We say that a sequence of random variables $\{X_n\}$ is mixing (or strongly mixing) if and only if$$\lim_{k \rightarrow \infty} \mathrm{Cov}\left[f(X_{n+1},\ldots,X_{n+q}),\, g(X_{n+k+1},\ldots,X_{n+k+q})\right] = 0$$for any two functions $f$ and $g$ and for any $n$ and $q$.

In other words, a sequence is strongly mixing if and only if the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ become more and more independent as $k$ increases (for any $n$ and $q$). This is a milder requirement than independence (see Independent sequences above): if $\{X_n\}$ is an independent sequence, all its terms are independent from one another; if $\{X_n\}$ is a mixing sequence, its terms can be dependent, but they become less and less dependent as the distance between their locations in the sequence increases. Of course, an independent sequence is also a mixing sequence, while the converse is not necessarily true.
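The stationary AR(1) process from the previous section is a standard example of a strongly mixing sequence when $|\rho| < 1$. The sketch below (Python/NumPy; the parameter values and the choice $f = g = \tanh$ with $q = 1$ are illustrative assumptions) estimates $\mathrm{Cov}[f(X_1), g(X_{1+k})]$ across many simulated paths and shows it shrinking toward zero as the gap $k$ grows.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Stationary AR(1) paths; strongly mixing for |rho| < 1.
rho, n_paths, T = 0.8, 100_000, 41
x = np.empty((n_paths, T))
x[:, 0] = rng.normal(scale=np.sqrt(1.0 / (1.0 - rho**2)), size=n_paths)
for t in range(1, T):
    x[:, t] = rho * x[:, t - 1] + rng.standard_normal(n_paths)

f = g = np.tanh  # bounded test functions applied to single terms (q = 1)

for k in (1, 5, 10, 20, 40):
    a, b = f(x[:, 0]), g(x[:, k])
    cov = np.mean(a * b) - np.mean(a) * np.mean(b)
    print(f"k = {k:2d}: estimated covariance ~ {cov:.4f}")  # -> 0 as k grows
```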

Ergodic sequences

In this section we discuss ergodicity. Roughly speaking, ergodicity is a weak concept of independence for sequences of random variables.

In the subsections above we have discussed two other concepts of independence for sequences of random variables:

  1. independent sequences are sequences of random variables whose terms are mutually independent;

  2. mixing sequences are sequences of random variables whose terms can be dependent but become less and less dependent as their distance increases (by distance we mean how far apart they are located in the sequence).

Requiring that a sequence be mixing is weaker than requiring that a sequence be independent: in fact, an independent sequence is also mixing, but the converse is not true.

Requiring that a sequence be ergodic is even weaker than requiring that a sequence be mixing (mixing implies ergodicity but not vice versa). This is probably all you need to know if you are not studying asymptotic theory at an advanced level, because ergodicity is quite a complicated topic and the definition of ergodicity is fairly abstract. Nevertheless, we give here a quick definition of ergodicity for the sake of completeness.

Denote by $\mathbb{R}^\infty$ the set of all possible sequences of real numbers. When $\{x_n\}$ is a sequence of real numbers, denote by $S(\{x_n\})$ the subsequence obtained by dropping the first term of $\{x_n\}$, that is,$$S(\{x_n\}) = (x_2, x_3, x_4, \ldots)$$

We say that a subset $A \subseteq \mathbb{R}^\infty$ is a shift invariant set if and only if $S(\{x_n\})$ belongs to $A$ whenever $\{x_n\}$ belongs to $A$.

Definition A set $A \subseteq \mathbb{R}^\infty$ is shift invariant if and only if$$\{x_n\} \in A \implies S(\{x_n\}) \in A$$

Shift invariance is used to define ergodicity:

Definition A sequence of random variables $\{X_n\}$ is said to be an ergodic sequence if and only if$$\mathrm{P}\left(\{X_n\} \in A\right) = 0 \text{ or } 1$$whenever $A$ is a shift invariant set.
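The practical payoff of ergodicity is that, by the ergodic theorem, the time average of a single realization of a stationary ergodic sequence converges to the common mean. The sketch below (Python/NumPy; the specific distributions are illustrative assumptions) contrasts an ergodic IID sequence with the constant sequence $X_n = Z$ from the stationarity section, which is strictly stationary but not ergodic: the set of all sequences whose terms are all positive is shift invariant, and the constant sequence belongs to it with probability $\mathrm{P}(Z > 0) = 1/2$, which is neither 0 nor 1.

```python
import numpy as np

rng = np.random.default_rng(seed=6)

# Ergodic case: IID Uniform(0, 1). The time average of one realization
# approaches the common mean E[X_n] = 0.5.
x = rng.uniform(size=1_000_000)
print("IID time average:", x.mean())

# Non-ergodic case: X_n = Z for every n. The time average of one
# realization equals Z itself and does not settle on E[X_n] = 0.
z = rng.normal()
print("constant-sequence time average:", np.full(1_000_000, z).mean())
```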

Convergence

As we explained in the lecture entitled Limit of a sequence, whenever we want to assess whether a sequence converges to a limit, we need to define a distance function (or metric) to measure the distance between the terms of the sequence. Intuitively, a sequence converges to a limit if, by dropping a sufficiently high number of initial terms, the remaining terms can be made as close to each other as we wish. The problem is how to define "close to each other". As we have explained, this concept can be made fully rigorous by using the notion of a metric. Therefore, discussing the convergence of a sequence of random variables boils down to discussing which metrics can be used to measure the distance between two random variables.

Modes of convergence

In the following lectures, we introduce several different notions of convergence of a sequence of random variables: to each different notion corresponds a different way of measuring the distance between two random variables.

The notions of convergence (also called modes of convergence) introduced in the following lectures are:

  1. Pointwise convergence

  2. Almost sure convergence

  3. Convergence in probability

  4. Mean-square convergence

  5. Convergence in distribution
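As a preview of how these modes are studied in practice, the sketch below (Python/NumPy; the Exponential(1) distribution, the tolerance and the replication counts are illustrative assumptions) illustrates convergence in probability: by the law of large numbers, the sample mean of $n$ IID Exponential(1) draws converges in probability to 1, so the probability of a deviation larger than a fixed $\varepsilon$ shrinks as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Convergence in probability of the sample mean of IID Exponential(1)
# draws to the true mean 1: estimate P(|mean - 1| > eps) by simulation.
eps = 0.05
for n in (10, 100, 1_000, 5_000):
    means = rng.exponential(scale=1.0, size=(2_000, n)).mean(axis=1)
    freq = np.mean(np.abs(means - 1.0) > eps)
    print(f"n = {n:5d}: P(|mean - 1| > {eps}) ~ {freq:.3f}")
```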
