Covariance is a measure of association between two random variables.
Let us start with a definition of covariance.
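Definition The covariance between two random variables $X$ and $Y$, denoted by $\operatorname{Cov}[X,Y]$, is defined as
$$\operatorname{Cov}[X,Y]=\operatorname{E}\big[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\big]$$
provided the above expected values exist and are well-defined.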
To better understand the definition of covariance, let us analyze how it is constructed.
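Covariance is the expected value of the product $\overline{X}\,\overline{Y}$, where $\overline{X}$ and $\overline{Y}$ are defined as follows:
$$\overline{X}=X-\operatorname{E}[X],\qquad \overline{Y}=Y-\operatorname{E}[Y].$$
$\overline{X}$ and $\overline{Y}$ are the deviations of $X$ and $Y$ from their respective means.
When $\overline{X}\,\overline{Y}$ is positive, it means that:
either $X$ and $Y$ are both above their respective means;
or $X$ and $Y$ are both below their respective means.
On the contrary, when $\overline{X}\,\overline{Y}$ is negative, it means that:
either $X$ is above its mean and $Y$ is below its mean;
or $X$ is below its mean and $Y$ is above its mean.
In other words, when $\overline{X}\,\overline{Y}$ is positive, $X$ and $Y$ are concordant (their deviations from the mean have the same sign); when $\overline{X}\,\overline{Y}$ is negative, $X$ and $Y$ are discordant (their deviations from the mean have opposite signs).
Thus, the product $\overline{X}\,\overline{Y}$ can be interpreted as a measure of similarity between the deviations $\overline{X}$ and $\overline{Y}$: it is positive when they agree in sign and negative when they disagree. As a consequence, the covariance tells us how similar the deviations of the two variables (from their respective means) are on average. Intuitively, we could express the concept as follows:
when $\operatorname{Cov}[X,Y]>0$, $X$ and $Y$ tend to be concordant;
when $\operatorname{Cov}[X,Y]<0$, $X$ and $Y$ tend to be discordant.
When $\operatorname{Cov}[X,Y]=0$, $X$ and $Y$ do not display any of the above two tendencies.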
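The interpretation above can be checked on a small discrete case. The following is a minimal Python sketch, assuming a hypothetical joint probability mass function (the numbers are illustrative and do not come from this lecture). It splits the probability mass into concordant and discordant outcomes and computes the covariance as the expected product of the deviations.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y): keys are (x, y), values are probabilities.
# These values are an assumption made for illustration only.
joint_pmf = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

# Means of X and Y.
mean_x = sum(p * x for (x, y), p in joint_pmf.items())
mean_y = sum(p * y for (x, y), p in joint_pmf.items())

concordant_mass = Fraction(0)  # probability that the deviations share a sign
discordant_mass = Fraction(0)  # probability that the deviations differ in sign
covariance = Fraction(0)       # expected product of the deviations

for (x, y), p in joint_pmf.items():
    product = (x - mean_x) * (y - mean_y)
    covariance += p * product
    if product > 0:
        concordant_mass += p
    elif product < 0:
        discordant_mass += p

print(concordant_mass, discordant_mass, covariance)
```

In this hypothetical case most of the probability mass sits on concordant outcomes, and the covariance is positive, as the intuition suggests.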
The covariance between two random variables can also be defined by the formula
$$\operatorname{Cov}[X,Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$$
which is equivalent to the formula in the definition above.
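The equivalence of the two definitions is proved as follows:
$$\begin{aligned}
\operatorname{Cov}[X,Y] &= \operatorname{E}\big[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\big] \\
&= \operatorname{E}\big[XY - X\operatorname{E}[Y] - \operatorname{E}[X]\,Y + \operatorname{E}[X]\operatorname{E}[Y]\big] \\
&= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y] - \operatorname{E}[X]\operatorname{E}[Y] + \operatorname{E}[X]\operatorname{E}[Y] \\
&= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y]
\end{aligned}$$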
It is easy to see from this formula that the covariance between $X$ and $Y$ exists and is well-defined only as long as the expected values $\operatorname{E}[X]$, $\operatorname{E}[Y]$ and $\operatorname{E}[XY]$ exist and are well-defined.
This formula is of great practical relevance and is used very often in these lectures. It will often be referred to as the covariance formula.
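The equivalence of the two formulas can also be verified numerically. Below is a small Python sketch, assuming a hypothetical joint probability mass function (not taken from this lecture). It computes the covariance once as the expected product of the deviations and once as the expected product minus the product of the expected values. Exact fractions are used so that the two results can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); the values are illustrative assumptions.
joint_pmf = {
    (0, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
    (2, 2): Fraction(1, 2),
}

def expectation(pmf, g):
    """Expected value of g(X, Y) under a discrete joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

mean_x = expectation(joint_pmf, lambda x, y: x)
mean_y = expectation(joint_pmf, lambda x, y: y)

# Formula in the definition: expected product of the deviations.
cov_definition = expectation(
    joint_pmf, lambda x, y: (x - mean_x) * (y - mean_y)
)

# Covariance formula: E[XY] - E[X] E[Y].
cov_formula = expectation(joint_pmf, lambda x, y: x * y) - mean_x * mean_y

print(cov_definition, cov_formula)
```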
The following example shows how to compute the covariance between two discrete random variables.
Example Let $X$ be a random vector and denote its components by $X_1$ and $X_2$. Let the support of $X$ be […] and its joint probability mass function be […]. The support of $X_1$ is […] and its marginal probability mass function is […]. The expected value of $X_1$ is […]. The support of $X_2$ is […] and its marginal probability mass function is […]. The expected value of $X_2$ is […]. Using the transformation theorem, we can compute the expected value of $X_1X_2$: […]. Hence, the covariance between $X_1$ and $X_2$ is […].
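The steps of the example (marginal probability mass functions, expected values, expected value of the product via the transformation theorem, covariance formula) can be sketched in Python. The joint probability mass function below is hypothetical and is not the one used in the example; the helper names `marginal` and `mean` are also illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf of the vector (X_1, X_2); illustrative values only.
joint_pmf = {
    (1, 0): Fraction(1, 4),
    (2, 1): Fraction(1, 2),
    (3, 1): Fraction(1, 4),
}

def marginal(pmf, index):
    """Marginal pmf of the component at position `index` (0 or 1)."""
    out = defaultdict(Fraction)
    for point, p in pmf.items():
        out[point[index]] += p
    return dict(out)

def mean(pmf_1d):
    """Expected value of a scalar discrete random variable."""
    return sum(value * p for value, p in pmf_1d.items())

# Step 1: marginal pmfs and expected values of the two components.
pmf_x1 = marginal(joint_pmf, 0)
pmf_x2 = marginal(joint_pmf, 1)
mean_x1 = mean(pmf_x1)
mean_x2 = mean(pmf_x2)

# Step 2: E[X_1 X_2], summing x1 * x2 * p over the joint support
# (transformation theorem).
mean_product = sum(x1 * x2 * p for (x1, x2), p in joint_pmf.items())

# Step 3: covariance formula.
covariance = mean_product - mean_x1 * mean_x2

print(pmf_x1, pmf_x2, covariance)
```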
More examples, including examples of how to compute the covariance between two continuous random variables, can be found in the solved exercises at the bottom of this page.
The following subsections contain more details on covariance.
Let $X$ be a random variable, then
$$\operatorname{Cov}[X,X]=\operatorname{Var}[X]$$
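It descends from the definition of variance:
$$\operatorname{Var}[X]=\operatorname{E}\big[(X-\operatorname{E}[X])^2\big]=\operatorname{E}\big[(X-\operatorname{E}[X])(X-\operatorname{E}[X])\big]=\operatorname{Cov}[X,X]$$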
The covariance operator is symmetric:
$$\operatorname{Cov}[X,Y]=\operatorname{Cov}[Y,X]$$
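By the definition of covariance, we have
$$\operatorname{Cov}[X,Y]=\operatorname{E}\big[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\big]=\operatorname{E}\big[(Y-\operatorname{E}[Y])(X-\operatorname{E}[X])\big]=\operatorname{Cov}[Y,X]$$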
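Let $X$ and $Y$ be two random variables. Then the variance of their sum is
$$\operatorname{Var}[X+Y]=\operatorname{Var}[X]+\operatorname{Var}[Y]+2\operatorname{Cov}[X,Y]$$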
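The above formula is derived as follows:
$$\begin{aligned}
\operatorname{Var}[X+Y] &= \operatorname{E}\big[(X+Y-\operatorname{E}[X+Y])^2\big] \\
&= \operatorname{E}\big[\big((X-\operatorname{E}[X])+(Y-\operatorname{E}[Y])\big)^2\big] \\
&= \operatorname{E}\big[(X-\operatorname{E}[X])^2\big]+\operatorname{E}\big[(Y-\operatorname{E}[Y])^2\big]+2\operatorname{E}\big[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\big] \\
&= \operatorname{Var}[X]+\operatorname{Var}[Y]+2\operatorname{Cov}[X,Y]
\end{aligned}$$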
Thus, to compute the variance of the sum of two random variables we need to know their covariance.
Obviously then, the formula
$$\operatorname{Var}[X+Y]=\operatorname{Var}[X]+\operatorname{Var}[Y]$$
holds only when $X$ and $Y$ have zero covariance.
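The formula for the variance of a sum can be checked on a small case. The Python sketch below assumes a hypothetical joint probability mass function (illustrative values, not from this lecture). It computes the variance of the sum directly from the distribution of the sum and compares it with the sum of the two variances plus twice the covariance.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); the values are illustrative assumptions.
joint_pmf = {
    (0, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
    (2, 2): Fraction(1, 2),
}

def expectation(pmf, g):
    """Expected value of g(X, Y) under a discrete joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def variance(pmf, g):
    """Variance of g(X, Y), computed as E[g^2] - E[g]^2."""
    return expectation(pmf, lambda x, y: g(x, y) ** 2) - expectation(pmf, g) ** 2

var_x = variance(joint_pmf, lambda x, y: x)
var_y = variance(joint_pmf, lambda x, y: y)
cov_xy = (
    expectation(joint_pmf, lambda x, y: x * y)
    - expectation(joint_pmf, lambda x, y: x) * expectation(joint_pmf, lambda x, y: y)
)

# Left-hand side: variance of X + Y computed directly.
var_sum_direct = variance(joint_pmf, lambda x, y: x + y)

# Right-hand side: Var[X] + Var[Y] + 2 Cov[X, Y].
var_sum_formula = var_x + var_y + 2 * cov_xy

print(var_sum_direct, var_sum_formula, var_x + var_y)
```

Because the covariance is not zero in this hypothetical case, the sum of the two variances alone differs from the variance of the sum.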
The formula for the variance of a sum of two random variables can be generalized to sums of more than two random variables (see variance of the sum of n random variables).
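The covariance operator is linear in both of its arguments. Let $X$, $Y$ and $Z$ be three random variables and let $a$ and $b$ be two constants. Then, the first argument is linear:
$$\operatorname{Cov}[aX+bY,Z]=a\operatorname{Cov}[X,Z]+b\operatorname{Cov}[Y,Z]$$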
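This is proved by using the linearity of the expected value:
$$\begin{aligned}
\operatorname{Cov}[aX+bY,Z] &= \operatorname{E}\big[(aX+bY-\operatorname{E}[aX+bY])(Z-\operatorname{E}[Z])\big] \\
&= \operatorname{E}\big[\big(a(X-\operatorname{E}[X])+b(Y-\operatorname{E}[Y])\big)(Z-\operatorname{E}[Z])\big] \\
&= a\operatorname{E}\big[(X-\operatorname{E}[X])(Z-\operatorname{E}[Z])\big]+b\operatorname{E}\big[(Y-\operatorname{E}[Y])(Z-\operatorname{E}[Z])\big] \\
&= a\operatorname{Cov}[X,Z]+b\operatorname{Cov}[Y,Z]
\end{aligned}$$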
By symmetry, the second argument is also linear:
$$\operatorname{Cov}[X,aY+bZ]=a\operatorname{Cov}[X,Y]+b\operatorname{Cov}[X,Z]$$
Linearity in both the first and second argument is called bilinearity.
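By iteratively applying the above arguments, one can prove that bilinearity also holds for linear combinations of more than two variables:
$$\operatorname{Cov}\Big[\sum_{i=1}^{n}a_iX_i,\ \sum_{j=1}^{m}b_jY_j\Big]=\sum_{i=1}^{n}\sum_{j=1}^{m}a_ib_j\operatorname{Cov}[X_i,Y_j]$$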
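Linearity in the first argument can be verified on a concrete case. The following Python sketch assumes a hypothetical joint probability mass function for three variables and hypothetical constants (none of these values come from this lecture). It compares the covariance of a linear combination with the corresponding linear combination of covariances.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y, Z); the values are illustrative assumptions.
joint_pmf = {
    (0, 1, 2): Fraction(1, 4),
    (1, 0, 1): Fraction(1, 4),
    (2, 2, 0): Fraction(1, 2),
}

def expectation(g):
    """Expected value of g(X, Y, Z) under the joint pmf above."""
    return sum(p * g(x, y, z) for (x, y, z), p in joint_pmf.items())

def covariance(g, h):
    """Cov[g, h] computed with the covariance formula E[gh] - E[g] E[h]."""
    return expectation(lambda x, y, z: g(x, y, z) * h(x, y, z)) - (
        expectation(g) * expectation(h)
    )

a, b = 3, -2  # hypothetical constants

# Left-hand side: Cov[aX + bY, Z].
lhs = covariance(lambda x, y, z: a * x + b * y, lambda x, y, z: z)

# Right-hand side: a Cov[X, Z] + b Cov[Y, Z].
rhs = a * covariance(lambda x, y, z: x, lambda x, y, z: z) + b * covariance(
    lambda x, y, z: y, lambda x, y, z: z
)

print(lhs, rhs)
```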
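The variance of the sum of $n$ random variables $X_1,\ldots,X_n$ is
$$\operatorname{Var}\Big[\sum_{i=1}^{n}X_i\Big]=\sum_{i=1}^{n}\operatorname{Var}[X_i]+2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\operatorname{Cov}[X_i,X_j]$$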
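This is demonstrated using the bilinearity of the covariance operator (see above), together with its symmetry and the fact that the covariance of a random variable with itself is its variance:
$$\begin{aligned}
\operatorname{Var}\Big[\sum_{i=1}^{n}X_i\Big] &= \operatorname{Cov}\Big[\sum_{i=1}^{n}X_i,\ \sum_{j=1}^{n}X_j\Big] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{Cov}[X_i,X_j] \\
&= \sum_{i=1}^{n}\operatorname{Cov}[X_i,X_i]+\sum_{i=1}^{n}\sum_{j\neq i}\operatorname{Cov}[X_i,X_j] \\
&= \sum_{i=1}^{n}\operatorname{Var}[X_i]+2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\operatorname{Cov}[X_i,X_j]
\end{aligned}$$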
This formula implies that when all the random variables in the sum have zero covariance with each other, the variance of the sum is just the sum of the variances:
$$\operatorname{Var}\Big[\sum_{i=1}^{n}X_i\Big]=\sum_{i=1}^{n}\operatorname{Var}[X_i]$$
This is true, for example, when the random variables in the sum are mutually independent (because independence implies zero covariance).
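When the variances and pairwise covariances are collected in a matrix, the formula for the variance of a sum of several random variables is straightforward to apply. The Python sketch below assumes a hypothetical covariance matrix for three random variables (illustrative values only); the function name `variance_of_sum` is likewise an illustrative choice.

```python
def variance_of_sum(cov_matrix):
    """Variance of X_1 + ... + X_n given the matrix of Cov[X_i, X_j].

    Adds the variances on the diagonal to twice the covariances above it.
    """
    n = len(cov_matrix)
    variances = sum(cov_matrix[i][i] for i in range(n))
    pairwise = sum(cov_matrix[i][j] for i in range(n) for j in range(i + 1, n))
    return variances + 2 * pairwise

# Hypothetical symmetric covariance matrix; entry (i, j) is Cov[X_i, X_j].
cov_matrix = [
    [4, 1, 0],
    [1, 9, -2],
    [0, -2, 1],
]

var_sum = variance_of_sum(cov_matrix)

# By bilinearity, the same quantity is the sum of all entries of the matrix.
total_of_entries = sum(sum(row) for row in cov_matrix)

# With zero covariances, only the variances on the diagonal remain.
uncorrelated = [[4, 0, 0], [0, 9, 0], [0, 0, 1]]
var_sum_uncorrelated = variance_of_sum(uncorrelated)

print(var_sum, total_of_entries, var_sum_uncorrelated)
```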
Below you can find some exercises with explained solutions.
Let $X$ be a discrete random vector and denote its components by $X_1$ and $X_2$. Let the support of $X$ be […] and its joint probability mass function be […].
Compute the covariance between $X_1$ and $X_2$.
The support of $X_1$ is […] and its marginal probability mass function is […]. The expected value of $X_1$ is […]. The support of $X_2$ is […] and its marginal probability mass function is […]. The expected value of $X_2$ is […]. By using the transformation theorem, we can compute the expected value of $X_1X_2$: […]. Hence, the covariance between $X_1$ and $X_2$ is […].
Let $X$ be a discrete random vector and denote its entries by $X_1$ and $X_2$. Let the support of $X$ be […] and its joint probability mass function be […].
Compute the covariance between $X_1$ and $X_2$.
The support of $X_1$ is […] and its marginal probability mass function is […]. The mean of $X_1$ is […]. The support of $X_2$ is […] and its marginal probability mass function is […]. The mean of $X_2$ is […]. The expected value of the product $X_1X_2$ can be derived by using the transformation theorem: […]. Therefore, by putting the pieces together, we obtain that the covariance between $X_1$ and $X_2$ is […].
Let $X$ and $Y$ be two random variables such that
Compute the following covariance:
By the bilinearity of the covariance operator, we have
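Exercises of this kind reduce to expanding the covariance of two linear combinations term by term. The following Python sketch shows the mechanics, assuming hypothetical variances, covariance and coefficients (they are not the values of this exercise). The function name `cov_of_linear_combinations` is an illustrative choice.

```python
def cov_of_linear_combinations(a, b, cov_matrix):
    """Cov[sum_i a_i X_i, sum_j b_j X_j] obtained by bilinearity.

    cov_matrix[i][j] must contain Cov[X_i, X_j].
    """
    return sum(
        a[i] * b[j] * cov_matrix[i][j]
        for i in range(len(a))
        for j in range(len(b))
    )

# Hypothetical inputs: Var[X] = 2, Var[Y] = 3, Cov[X, Y] = 1.
cov_matrix = [
    [2, 1],
    [1, 3],
]

# Cov[X + 2Y, 3X - Y]
#   = 3 Var[X] - Cov[X, Y] + 6 Cov[Y, X] - 2 Var[Y]
result = cov_of_linear_combinations([1, 2], [3, -1], cov_matrix)

print(result)
```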
Let $(X,Y)$ be a continuous random vector with support $R_{XY}$: […]. In other words, $R_{XY}$ is the set of all couples $(x,y)$ such that […] and […]. Let the joint probability density function of $(X,Y)$ be […]. Compute the covariance between $X$ and $Y$.
The support of $X$ is […]. Thus, when $x$ does not belong to the support of $X$, the marginal probability density function of $X$ is $0$, while, when $x$ belongs to the support of $X$, the marginal probability density function of $X$ is […]. Therefore, the marginal probability density function of $X$ is […]. The expected value of $X$ is […]. The support of $Y$ is […]. When $y$ does not belong to the support of $Y$, the marginal probability density function of $Y$ is $0$, while, when $y$ belongs to the support of $Y$, the marginal probability density function of $Y$ is […]. Therefore, the marginal probability density function of $Y$ is […]. The expected value of $Y$ is […]. The expected value of the product $XY$ can be computed thanks to the transformation theorem: […]. Hence, by the covariance formula, the covariance between $X$ and $Y$ is […].
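For continuous random vectors, the same steps can be checked numerically by approximating the double integrals. The Python sketch below assumes a hypothetical joint density, $f(x,y)=x+y$ on the unit square (this is not the density of the exercise), and uses a simple midpoint rule. For this hypothetical density the exact covariance is $-1/144$.

```python
def joint_density(x, y):
    """Hypothetical joint density on the unit square: f(x, y) = x + y."""
    return x + y

def expectation(g, n=200):
    """Approximate E[g(X, Y)] with a midpoint rule on an n-by-n grid
    over the unit square [0, 1] x [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += g(x, y) * joint_density(x, y)
    return total * h * h

mean_x = expectation(lambda x, y: x)
mean_y = expectation(lambda x, y: y)
mean_xy = expectation(lambda x, y: x * y)

# Covariance formula: E[XY] - E[X] E[Y].
cov_xy = mean_xy - mean_x * mean_y

print(mean_x, mean_y, mean_xy, cov_xy)
```

The grid size is a trade-off between accuracy and running time; for densities that are not smooth, or supports that are not rectangular, a dedicated integration routine would be a better choice.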
Let $(X,Y)$ be a continuous random vector with support […] and let its joint probability density function be […]. Compute the covariance between $X$ and $Y$.
The support of $X$ is […]. When $x$ does not belong to the support of $X$, the marginal probability density function of $X$ is $0$, while, when $x$ belongs to the support of $X$, the marginal probability density function of $X$ is […]. By putting the pieces together, we have that the marginal probability density function of $X$ is […]. The expected value of $X$ is […]. The support of $Y$ is […]. When $y$ does not belong to the support of $Y$, the marginal probability density function of $Y$ is $0$, while, when $y$ belongs to the support of $Y$, the marginal probability density function of $Y$ is: […]. We do not explicitly compute the integral, but we write the marginal probability density function of $Y$ as follows: […]. The expected value of $Y$ is […]. The expected value of the product $XY$ can be computed thanks to the transformation theorem: […]. Hence, the covariance formula gives […].
Let $X$ and $Y$ be two random variables such that
Compute the following covariance:
By the bilinearity of the covariance operator, we have that
Please cite as:
Taboga, Marco (2021). "Covariance", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-probability/covariance.