One of the central topics in probability theory and statistics is the study of sequences of random variables, that is, of sequences $\{X_n\}$ whose generic element $X_n$ is a random variable.
There are several reasons why sequences of random variables are important:
Often, we need to analyze a random variable $X$, but for some reason $X$ is too complex to analyze directly. What we usually do in this case is approximate $X$ by simpler random variables $X_n$ that are easier to study; these approximating random variables are arranged into a sequence $\{X_n\}$ and they become better and better approximations of $X$ as $n$ increases. This is exactly what we did when we introduced the Lebesgue integral.
In statistical theory, $X_n$ is often an estimate of an unknown quantity whose value and whose properties depend on the sample size $n$ (the sample size is the number of observations used to compute the estimate). Usually, we are able to analyze the properties of $X_n$ only asymptotically (i.e., when $n$ tends to infinity). In this case, $\{X_n\}$ is a sequence of estimates and we analyze the properties of the limit of $\{X_n\}$, in the hope that a large sample (the one we observe) and an infinite sample (the one we analyze by taking the limit of $\{X_n\}$) have a similar behavior.
In many applications a random variable is observed repeatedly through time (for example, the price of a stock is observed every day). In this case $\{X_n\}$ is the sequence of observations on the random variable and $n$ is a time-index (in the stock price example, $X_n$ is the price observed in the $n$-th period).
In this lecture, we introduce some terminology related to sequences of random variables.
Let $\{x_n\}$ be a sequence of real numbers and $\{X_n\}$ a sequence of random variables. If the real number $x_n$ is a realization of the random variable $X_n$ for every $n$, then we say that the sequence of real numbers $\{x_n\}$ is a realization of the sequence of random variables $\{X_n\}$ and we write
$$\{X_n\} = \{x_n\}$$
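The definition can be illustrated with a minimal sketch. It assumes, purely for illustration, that every term of the sequence is a standard normal random variable and that the seed of the random number generator plays the role of the outcome of the underlying experiment; the function name `realization` is hypothetical.

```python
import random

def realization(seed, length):
    """Return the first `length` terms x_1, ..., x_length of one realization.

    Assumption for illustration: each X_n is a standard normal random
    variable, and `seed` stands in for the outcome of the experiment.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

# Two different outcomes give two different sequences of real numbers
first = realization(seed=1, length=5)
second = realization(seed=2, length=5)
print(first)
print(second)
```

Each printed list is a (truncated) sequence of real numbers, that is, one realization of the sequence of random variables.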
Let $\Omega$ be a sample space. Let $\{X_n\}$ be a sequence of random variables. We say that $\{X_n\}$ is a sequence of random variables defined on the sample space $\Omega$ if all the random variables $X_n$ belonging to the sequence $\{X_n\}$ are functions from $\Omega$ to $\mathbb{R}$.
Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is an independent sequence of random variables (or a sequence of independent random variables) if every finite subset of $\{X_n\}$ (i.e., every finite set of random variables belonging to the sequence) is a set of mutually independent random variables.
Let $\{X_n\}$ be a sequence of random variables. Denote by $F_n(x)$ the distribution function of a generic element $X_n$ of the sequence. We say that $\{X_n\}$ is a sequence of identically distributed random variables if any two elements of the sequence have the same distribution function:
$$F_n(x) = F_k(x), \quad \forall x \in \mathbb{R}, \ \forall n, k \in \mathbb{N}$$
Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is a sequence of independent and identically distributed random variables (or an IID sequence of random variables), if $\{X_n\}$ is both a sequence of independent random variables and a sequence of identically distributed random variables.
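As a rough illustration, the sketch below simulates many realizations of an IID sequence and compares two of its terms. The Uniform(0, 1) distribution, the number of replications and the helper name `iid_sequence` are assumptions made for the example. Equal sample means and a sample covariance close to zero are only necessary consequences of the IID property, so the check illustrates the definition without proving anything.

```python
import random
import statistics

def iid_sequence(rng, length):
    # Each term is an independent draw from the same Uniform(0, 1) distribution
    return [rng.random() for _ in range(length)]

rng = random.Random(0)
replications = [iid_sequence(rng, 10) for _ in range(20000)]

# Values taken by X_1 and X_10 across the simulated realizations
x1 = [seq[0] for seq in replications]
x10 = [seq[9] for seq in replications]

mean_x1 = statistics.fmean(x1)
mean_x10 = statistics.fmean(x10)

# Sample covariance between X_1 and X_10 (close to zero under independence)
cov = sum((a - mean_x1) * (b - mean_x10) for a, b in zip(x1, x10)) / (len(x1) - 1)

print(mean_x1, mean_x10, cov)
```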
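Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. Take a first group of $q$ successive terms of the sequence $X_{n+1}$, ..., $X_{n+q}$. Now take a second group of $q$ successive terms of the sequence $X_{n+k+1}$, ..., $X_{n+k+q}$. The second group is located $k$ positions after the first group. Denote the joint distribution function of the first group of terms by
$$F_{n+1,\ldots,n+q}(x_1,\ldots,x_q)$$
and the joint distribution function of the second group of terms by
$$F_{n+k+1,\ldots,n+k+q}(x_1,\ldots,x_q)$$
The sequence $\{X_n\}$ is said to be stationary (or strictly stationary) if and only if
$$F_{n+1,\ldots,n+q}(x_1,\ldots,x_q) = F_{n+k+1,\ldots,n+k+q}(x_1,\ldots,x_q)$$
for any $n$, $k$ and $q$ and for any vector $(x_1,\ldots,x_q) \in \mathbb{R}^q$.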
In other words, a sequence is strictly stationary if and only if the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ have the same distribution (for any $n$, $k$ and $q$). Requiring strict stationarity is weaker than requiring that a sequence be IID (see IID sequences above): if $\{X_n\}$ is an IID sequence, then it is also strictly stationary, while the converse is not necessarily true.
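To see why the converse can fail, consider a simple example (added here for illustration, with $Z$ denoting any random variable that is not almost surely constant): every term of the sequence is equal to the same random variable $Z$.

```latex
\begin{aligned}
&X_n = Z \quad \text{for every } n, \\
&\mathrm{P}(X_{n+1} \le x_1, \ldots, X_{n+q} \le x_q)
  = \mathrm{P}\bigl(Z \le \min(x_1, \ldots, x_q)\bigr)
  = \mathrm{P}(X_{n+k+1} \le x_1, \ldots, X_{n+k+q} \le x_q), \\
&\mathrm{P}(X_1 \le x, X_2 \le x) = \mathrm{P}(Z \le x)
  \ne \mathrm{P}(Z \le x)^2 = \mathrm{P}(X_1 \le x)\,\mathrm{P}(X_2 \le x)
  \quad \text{whenever } 0 < \mathrm{P}(Z \le x) < 1.
\end{aligned}
```

The second line shows that the two groups of terms have the same joint distribution function, so the sequence is strictly stationary. The third line shows that its terms are not independent, so the sequence is not IID.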
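Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. We say that $\{X_n\}$ is a covariance stationary sequence (or weakly stationary sequence) if
$$\begin{aligned} &(1)\quad \exists\, \mu \in \mathbb{R} : \operatorname{E}[X_n] = \mu, \quad \forall n > 0 \\ &(2)\quad \forall j \ge 0,\ \exists\, \gamma_j \in \mathbb{R} : \operatorname{Cov}[X_n, X_{n-j}] = \gamma_j, \quad \forall n > j \end{aligned}$$
where $n$ and $j$ are, of course, integers.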
Property (1) means that all the random variables belonging to the sequence have the same mean.
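Property (2) means that the covariance between a term $X_n$ of the sequence and the term that is located $j$ positions before it ($X_{n-j}$) is always the same, irrespective of how $X_n$ has been chosen. In other words, $\operatorname{Cov}[X_n, X_{n-j}]$ depends only on $j$ and not on $n$. Note also that property (2) implies that all the random variables in the sequence have the same variance (remember that $\operatorname{Cov}[X_n, X_n] = \operatorname{Var}[X_n]$):
$$\exists\, \gamma_0 \in \mathbb{R} : \operatorname{Var}[X_n] = \gamma_0, \quad \forall n > 0$$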
Note that strict stationarity (see above) implies weak stationarity only if the mean and all the covariances exist and are finite. Conversely, covariance stationarity does not imply strict stationarity (the former imposes restrictions only on the first and second moments, while the latter imposes restrictions on the whole distribution).
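The two properties can be checked numerically on a simple example. The sketch below assumes a first-order moving average, $X_n = \varepsilon_n + \theta \varepsilon_{n-1}$ with IID standard normal shocks, which is not discussed in this lecture and is chosen only because its autocovariances are known: $1 + \theta^2$ at lag 0, $\theta$ at lag 1 and zero at longer lags. The coefficient value and all function names are hypothetical.

```python
import random

THETA = 0.5  # hypothetical moving-average coefficient

def ma1_sequence(rng, length, theta=THETA):
    """Return X_1, ..., X_length with X_n = e_n + theta * e_{n-1}, e_n IID N(0, 1)."""
    shocks = [rng.gauss(0.0, 1.0) for _ in range(length + 1)]
    return [shocks[n] + theta * shocks[n - 1] for n in range(1, length + 1)]

def sample_cov(a, b):
    """Sample covariance between two equally long lists of numbers."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def cov_at(replications, n, j):
    """Estimate Cov[X_n, X_{n-j}] across replications (n is 1-based)."""
    later = [seq[n - 1] for seq in replications]
    earlier = [seq[n - 1 - j] for seq in replications]
    return sample_cov(later, earlier)

rng = random.Random(42)
replications = [ma1_sequence(rng, 10) for _ in range(20000)]

# For each lag j, compare the estimate at position n = 3 with the one at n = 9
for j in (0, 1, 2):
    print(j, round(cov_at(replications, 3, j), 2), round(cov_at(replications, 9, j), 2))
```

For each lag, the two estimates should be close to each other and to the theoretical value, which is what property (2) requires: the covariance depends on the lag but not on the position in the sequence.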
Let $\{X_n\}$ be a sequence of random variables defined on a sample space $\Omega$. Intuitively, $\{X_n\}$ is a mixing sequence if any two groups of terms of the sequence that are far apart from each other are approximately independent (and the further apart they are, the closer they are to being independent).
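Take a first group of $q$ successive terms of the sequence $X_{n+1}$, ..., $X_{n+q}$. Now take a second group of $q$ successive terms of the sequence $X_{n+k+1}$, ..., $X_{n+k+q}$. The second group is located $k$ positions after the first group. The two groups of terms are independent if and only if
$$\operatorname{E}[f(X_{n+1},\ldots,X_{n+q})\, g(X_{n+k+1},\ldots,X_{n+k+q})] = \operatorname{E}[f(X_{n+1},\ldots,X_{n+q})]\, \operatorname{E}[g(X_{n+k+1},\ldots,X_{n+k+q})]$$
for any two functions $f$ and $g$. This is just the definition of independence between the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ (see Mutually independent random vectors). Trivially, the above condition can be written as
$$\operatorname{E}[f(X_{n+1},\ldots,X_{n+q})\, g(X_{n+k+1},\ldots,X_{n+k+q})] - \operatorname{E}[f(X_{n+1},\ldots,X_{n+q})]\, \operatorname{E}[g(X_{n+k+1},\ldots,X_{n+k+q})] = 0$$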
If this condition is true asymptotically (i.e., when $k \to \infty$), then we say that the sequence $\{X_n\}$ is mixing.
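Definition We say that a sequence of random variables $\{X_n\}$ is mixing (or strongly mixing) if and only if
$$\lim_{k \to \infty} \Bigl\{ \operatorname{E}[f(X_{n+1},\ldots,X_{n+q})\, g(X_{n+k+1},\ldots,X_{n+k+q})] - \operatorname{E}[f(X_{n+1},\ldots,X_{n+q})]\, \operatorname{E}[g(X_{n+k+1},\ldots,X_{n+k+q})] \Bigr\} = 0$$
for any two functions $f$ and $g$ and for any $n$ and $q$.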
In other words, a sequence is strongly mixing if and only if the two random vectors $(X_{n+1},\ldots,X_{n+q})$ and $(X_{n+k+1},\ldots,X_{n+k+q})$ tend to become more and more independent as $k$ increases (for any $n$ and $q$). This is a milder requirement than the requirement of independence (see Independent sequences above): if $\{X_n\}$ is an independent sequence, all its terms are independent from one another; if $\{X_n\}$ is a mixing sequence, its terms can be dependent, but they become less and less dependent as the distance between their locations in the sequence increases. Of course, an independent sequence is also a mixing sequence, while the converse is not necessarily true.
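The decay of dependence can be observed numerically. The sketch below assumes a Gaussian first-order autoregression, $X_n = \phi X_{n-1} + \varepsilon_n$ with $|\phi| < 1$, which is not part of this lecture and is used only as a convenient example of a dependent sequence. It estimates the difference appearing in the definition for groups of a single term ($q = 1$) and for one pair of functions, so it illustrates the definition rather than verifying it (the definition involves all pairs of functions and all group sizes). All names and parameter values are hypothetical.

```python
import math
import random

PHI = 0.5  # hypothetical autoregressive coefficient, |PHI| < 1

def ar1_sequence(rng, length, phi=PHI):
    """Return X_1, ..., X_length with X_n = phi * X_{n-1} + e_n, e_n IID N(0, 1).

    X_1 is drawn from the stationary distribution N(0, 1 / (1 - phi**2)).
    """
    x = rng.gauss(0.0, math.sqrt(1.0 / (1.0 - phi ** 2)))
    terms = [x]
    for _ in range(length - 1):
        x = phi * x + rng.gauss(0.0, 1.0)
        terms.append(x)
    return terms

def dependence_gap(replications, n, k, f, g):
    """Estimate E[f(X_n) g(X_{n+k})] - E[f(X_n)] E[g(X_{n+k})] (n is 1-based)."""
    fx = [f(seq[n - 1]) for seq in replications]
    gx = [g(seq[n - 1 + k]) for seq in replications]
    size = len(replications)
    mean_product = sum(a * b for a, b in zip(fx, gx)) / size
    return mean_product - (sum(fx) / size) * (sum(gx) / size)

def identity(x):
    return x

rng = random.Random(7)
replications = [ar1_sequence(rng, 12) for _ in range(20000)]

# The estimated gap should shrink towards zero as the distance k grows
for k in (1, 2, 4, 8):
    print(k, round(dependence_gap(replications, 1, k, identity, identity), 3))
```

With the identity functions the gap is the covariance between the two terms, which for this process equals $\phi^k / (1 - \phi^2)$ and therefore vanishes as $k$ grows.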
In this section we discuss ergodicity. Roughly speaking, ergodicity is a weak concept of independence for sequences of random variables.
In the subsections above we have discussed two other concepts of independence for sequences of random variables:
independent sequences are sequences of random variables whose terms are mutually independent;
mixing sequences are sequences of random variables whose terms can be dependent but become less and less dependent as their distance increases (by distance we mean how far apart they are located in the sequence).
Requiring that a sequence be mixing is weaker than requiring that a sequence be independent: in fact, an independent sequence is also mixing, but the converse is not true.
Requiring that a sequence be ergodic is even weaker than requiring that a sequence be mixing (mixing implies ergodicity but not vice versa). This is probably all you need to know if you are not studying asymptotic theory at an advanced level, because ergodicity is quite a complicated topic and the definition of ergodicity is fairly abstract. Nevertheless, we give here a quick definition of ergodicity for the sake of completeness.
Denote by $\mathbb{R}^{\infty}$ the set of all possible sequences of real numbers. When $\{x_n\}$ is a sequence of real numbers, denote by $\{x_n\}_{n>1}$ the subsequence obtained by dropping the first term of $\{x_n\}$, that is,
$$\{x_n\}_{n>1} = \{x_2, x_3, \ldots\}$$
We say that a subset $A \subseteq \mathbb{R}^{\infty}$ is a shift invariant set if and only if $\{x_n\}_{n>1}$ belongs to $A$ whenever $\{x_n\}$ belongs to $A$.
Definition A set $A \subseteq \mathbb{R}^{\infty}$ is shift invariant if and only if
$$\{x_n\} \in A \implies \{x_n\}_{n>1} \in A$$
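Two small examples (added for illustration, with $A$ and $B$ chosen arbitrarily) may help to fix ideas: a set defined through the long-run behavior of a sequence is shift invariant, while a set defined through its first term need not be.

```latex
\begin{aligned}
A &= \bigl\{ \{x_n\} \in \mathbb{R}^{\infty} : \lim_{n \to \infty} x_n = 0 \bigr\}
  && \text{(shift invariant)}, \\
B &= \bigl\{ \{x_n\} \in \mathbb{R}^{\infty} : x_1 > 0 \bigr\}
  && \text{(not shift invariant)}.
\end{aligned}
```

Dropping the first term of a sequence does not change its limit, so the shifted sequence still belongs to $A$. On the contrary, the sequence $\{1, -1, -1, \ldots\}$ belongs to $B$, but dropping its first term gives $\{-1, -1, \ldots\}$, which does not.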
Shift invariance is used to define ergodicity:
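Definition A sequence of random variables $\{X_n\}$ is said to be an ergodic sequence if and only if
$$\operatorname{P}(\{X_n\} \in A) = 0 \quad \text{or} \quad \operatorname{P}(\{X_n\} \in A) = 1$$
whenever $A$ is a shift invariant set.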
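A simple sequence that fails this requirement can be built as follows (an example added for illustration, where $Z$ is assumed to be a random variable taking the values 0 and 1 with equal probability, and $A$ is the set containing only the sequence whose terms are all equal to 1).

```latex
\begin{aligned}
&X_n = Z \ \text{for every } n, \qquad \mathrm{P}(Z = 1) = \mathrm{P}(Z = 0) = \tfrac{1}{2}, \\
&A = \bigl\{ \{1, 1, 1, \ldots\} \bigr\}, \\
&\mathrm{P}(\{X_n\} \in A) = \mathrm{P}(Z = 1) = \tfrac{1}{2} \notin \{0, 1\}.
\end{aligned}
```

The set $A$ is shift invariant because dropping the first term of the constant sequence leaves it unchanged, yet its probability is neither 0 nor 1. The sequence is therefore not ergodic: a single realization is either all zeros or all ones, and it reveals nothing about the outcome that did not occur.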
As we explained in the lecture entitled Limit of a sequence, whenever we want to assess whether a sequence is convergent to a limit, we need to define a distance function (or metric) to measure the distance between the terms of the sequence. Intuitively, a sequence converges to a limit if, by dropping a sufficiently high number of initial terms of the sequence, the remaining terms can be made as close to each other as we wish. The problem is how to define "close to each other". As we have explained, the concept of "close to each other" can be made fully rigorous by using the notion of a metric. Therefore, discussing convergence of a sequence of random variables boils down to discussing what metrics can be used to measure the distance between two random variables.
In the following lectures, we introduce several different notions of convergence of a sequence of random variables: to each different notion corresponds a different way of measuring the distance between two random variables.
The notions of convergence (also called modes of convergence) introduced in the following lectures are: