
Linear correlation

Linear correlation is a measure of dependence between two random variables.


Definition

Let X and Y be two random variables. The linear correlation coefficient (or Pearson's correlation coefficient) between X and Y, denoted by $\mathrm{Corr}[X,Y]$ or by $\rho_{XY}$, is defined as follows:
$$\mathrm{Corr}[X,Y]=\frac{\mathrm{Cov}[X,Y]}{\mathrm{std}[X]\,\mathrm{std}[Y]}$$
where $\mathrm{Cov}[X,Y]$ is the covariance between X and Y, and $\mathrm{std}[X]$ and $\mathrm{std}[Y]$ are the standard deviations of X and Y.

The linear correlation coefficient is well-defined only as long as $\mathrm{Cov}[X,Y]$, $\mathrm{std}[X]$ and $\mathrm{std}[Y]$ exist and are well-defined.

Note that, in principle, the ratio is well-defined only if $\mathrm{std}[X]$ and $\mathrm{std}[Y]$ are strictly greater than zero. However, it is often assumed that $\mathrm{Corr}[X,Y]=0$ when one of the two standard deviations is zero. This is equivalent to assuming that $\mathrm{Cov}[X,Y]=0$, because $\mathrm{Cov}[X,Y]=0$ when one of the two standard deviations is zero.
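The definition and the zero-standard-deviation convention above can be captured in a small helper. This is a minimal sketch; the function name and argument layout are illustrative, not from the text.

```python
def linear_correlation(cov_xy, std_x, std_y):
    """Pearson correlation Corr[X, Y] = Cov[X, Y] / (std[X] * std[Y]).

    Follows the convention described above: when either standard
    deviation is zero, the correlation is taken to be zero
    (consistent with Cov[X, Y] = 0 in that case).
    """
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov_xy / (std_x * std_y)


print(linear_correlation(2.0, 2.0, 2.0))  # 0.5
print(linear_correlation(0.0, 0.0, 1.0))  # 0.0 by convention
```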

Interpretation

The interpretation is similar to the interpretation of covariance: the correlation between X and Y provides a measure of how similar their deviations from the respective means are (see the lecture entitled Covariance for a detailed explanation).

Linear correlation has the property of being bounded between $-1$ and 1:
$$-1\leq \mathrm{Corr}[X,Y]\leq 1$$

Thanks to this property, correlation makes it easy to gauge the intensity of the linear dependence between two random variables: the closer the correlation is to 1, the stronger the positive linear dependence between X and Y (and the closer it is to $-1$, the stronger the negative linear dependence between X and Y).
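The two extremes of this bound are attained by exact linear relationships. The sketch below uses a hypothetical two-point distribution (not taken from the text) and checks that an increasing linear transform of X has correlation 1 with X, while a decreasing one has correlation $-1$.

```python
# Hypothetical two-point distribution for X: P(X = 0) = P(X = 1) = 1/2.
probs = [0.5, 0.5]
xs = [0.0, 1.0]

def mean_and_std(values, probs):
    # Population mean and standard deviation of a discrete distribution.
    mean = sum(p * v for p, v in zip(probs, values))
    var = sum(p * (v - mean) ** 2 for p, v in zip(probs, values))
    return mean, var ** 0.5

def corr(xs, ys, probs):
    # Pearson correlation: Cov[X, Y] / (std[X] * std[Y]).
    mx, sx = mean_and_std(xs, probs)
    my, sy = mean_and_std(ys, probs)
    cov = sum(p * (x - mx) * (y - my) for p, x, y in zip(probs, xs, ys))
    return cov / (sx * sy)

ys_up = [2 * x + 3 for x in xs]     # increasing linear transform of X
ys_down = [-2 * x + 3 for x in xs]  # decreasing linear transform of X
print(corr(xs, ys_up, probs))    # 1.0
print(corr(xs, ys_down, probs))  # -1.0
```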

Terminology

The following terminology is often used:

  1. If $\mathrm{Corr}[X,Y]>0$, then X and Y are said to be positively linearly correlated (or simply positively correlated).

  2. If $\mathrm{Corr}[X,Y]<0$, then X and Y are said to be negatively linearly correlated (or simply negatively correlated).

  3. If $\mathrm{Corr}[X,Y]\neq 0$, then X and Y are said to be linearly correlated (or simply correlated).

  4. If $\mathrm{Corr}[X,Y]=0$, then X and Y are said to be uncorrelated. Also note that $\mathrm{Corr}[X,Y]=0$ if and only if $\mathrm{Cov}[X,Y]=0$; therefore, two random variables X and Y are uncorrelated whenever $\mathrm{Cov}[X,Y]=0$.

Example

The following example shows how to compute the coefficient of linear correlation between two discrete random variables.

Example Let X be a $2$-dimensional random vector and denote its components by X_1 and X_2. Let the support of X be [eq21]and its joint probability mass function be[eq22]The support of X_1 is[eq23]and its probability mass function is[eq24]The expected value of X_1 is[eq25]The expected value of $X_{1}^{2}$ is[eq26]The variance of X_1 is[eq27]The standard deviation of X_1 is:[eq28]The support of X_2 is:[eq29]and its probability mass function is[eq30]The expected value of X_2 is[eq31]The expected value of $X_{2}^{2}$ is[eq32]The variance of X_2 is[eq33]The standard deviation of X_2 is[eq34]Using the transformation theorem, we can compute the expected value of $X_{1}X_{2}$:[eq35]Hence, the covariance between X_1 and X_2 is[eq36]and the linear correlation coefficient is:[eq37]
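As a companion to the worked example (whose specific numbers are not reproduced here), the following sketch runs the same sequence of steps on a hypothetical joint probability mass function: derive the marginals, compute means and standard deviations, obtain $E[X_{1}X_{2}]$ via the transformation theorem, and combine everything into the correlation coefficient.

```python
# Hypothetical joint pmf of (X1, X2); these are NOT the numbers from the
# article's example, just an illustrative distribution.
joint_pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def marginal(joint, index):
    # Sum the joint pmf over the other component to get a marginal pmf.
    pmf = {}
    for point, p in joint.items():
        pmf[point[index]] = pmf.get(point[index], 0.0) + p
    return pmf

def mean_and_std(pmf):
    mean = sum(p * v for v, p in pmf.items())
    var = sum(p * (v - mean) ** 2 for v, p in pmf.items())
    return mean, var ** 0.5

def correlation(joint):
    m1, s1 = mean_and_std(marginal(joint, 0))
    m2, s2 = mean_and_std(marginal(joint, 1))
    # Transformation theorem: E[X1*X2] = sum of x1*x2*p(x1, x2) over the support.
    e_product = sum(x1 * x2 * p for (x1, x2), p in joint.items())
    cov = e_product - m1 * m2
    return cov / (s1 * s2)

print(round(correlation(joint_pmf), 6))  # 0.6
```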

More details

The following sections contain more details about the linear correlation coefficient.

Correlation of a random variable with itself

Let X be a random variable. Then
$$\mathrm{Corr}[X,X]=1$$

Proof

This is proved as follows:
$$\mathrm{Corr}[X,X]=\frac{\mathrm{Cov}[X,X]}{\mathrm{std}[X]\,\mathrm{std}[X]}=\frac{\mathrm{Var}[X]}{\mathrm{Var}[X]}=1$$
where we have used the fact that
$$\mathrm{Cov}[X,X]=\mathrm{Var}[X]$$

Symmetry

The linear correlation coefficient is symmetric:
$$\mathrm{Corr}[X,Y]=\mathrm{Corr}[Y,X]$$

Proof

This is proved as follows:
$$\mathrm{Corr}[X,Y]=\frac{\mathrm{Cov}[X,Y]}{\mathrm{std}[X]\,\mathrm{std}[Y]}=\frac{\mathrm{Cov}[Y,X]}{\mathrm{std}[Y]\,\mathrm{std}[X]}=\mathrm{Corr}[Y,X]$$
where we have used the fact that covariance is symmetric:
$$\mathrm{Cov}[X,Y]=\mathrm{Cov}[Y,X]$$

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Let X be a $2\times 1$ discrete random vector and denote its components by X_1 and X_2. Let the support of X be[eq44]and its joint probability mass function be[eq45]

Compute the coefficient of linear correlation between X_1 and X_2.

Solution

The support of X_1 is[eq46]and its marginal probability mass function is[eq47]The expected value of X_1 is[eq48]The expected value of $X_{1}^{2}$ is[eq49]The variance of X_1 is[eq50]The standard deviation of X_1 is[eq51]The support of X_2 is[eq52]and its marginal probability mass function is[eq53]The expected value of X_2 is[eq54]The expected value of $X_{2}^{2}$ is[eq55]The variance of X_2 is[eq56]The standard deviation of X_2 is[eq57]Using the transformation theorem, we can compute the expected value of $X_{1}X_{2}$:[eq58]Hence, the covariance between X_1 and X_2 is[eq59]and the coefficient of linear correlation between X_1 and X_2 is[eq60]

Exercise 2

Let X be a $2\times 1$ discrete random vector and denote its entries by X_1 and X_2. Let the support of X be[eq61]and its joint probability mass function be[eq62]

Compute the coefficient of linear correlation between X_1 and X_2.

Solution

The support of X_1 is[eq63]and its marginal probability mass function is[eq64]The mean of X_1 is[eq65]The expected value of $X_{1}^{2}$ is[eq66]The variance of X_1 is[eq67]The standard deviation of X_1 is[eq68]The support of X_2 is[eq69]and its probability mass function is[eq70]The mean of X_2 is[eq71]The expected value of $X_{2}^{2}$ is[eq72]The variance of X_2 is[eq73]The standard deviation of X_2 is[eq74]The expected value of the product $X_{1}X_{2}$ can be derived using the transformation theorem:[eq75]Therefore, putting the pieces together, the covariance between X_1 and X_2 is[eq76]and the coefficient of linear correlation between X_1 and X_2 is[eq77]

Exercise 3

Let [eq78] be an absolutely continuous random vector with support [eq79]and let its joint probability density function be[eq80]Compute the coefficient of linear correlation between X and Y.

Solution

The support of Y is[eq81]When $y\notin R_{Y}$, the marginal probability density function of Y is 0, while, when $y\in R_{Y}$, the marginal probability density function of Y can be obtained by integrating x out of the joint probability density as follows:[eq82]Thus, the marginal probability density function of Y is[eq83]The expected value of Y is[eq84]The expected value of $Y^{2}$ is[eq85]The variance of Y is[eq86]The standard deviation of Y is[eq87]The support of X is[eq88]When $x\notin R_{X}$, the marginal probability density function of X is 0, while, when $x\in R_{X}$, the marginal probability density function of X can be obtained by integrating $y$ out of the joint probability density as follows:[eq89]We do not explicitly compute the integral, but we write the marginal probability density function of X as follows:[eq90]The expected value of X is[eq91]The expected value of $X^{2}$ is[eq92]The variance of X is[eq93]The standard deviation of X is[eq94]The expected value of the product $XY$ can be computed by using the transformation theorem:[eq95]Hence, by the covariance formula, the covariance between X and Y is[eq96]and the coefficient of linear correlation between X and Y is[eq97]
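The same pipeline of moments can be approximated numerically in the continuous case. The sketch below uses a hypothetical joint density $f(x,y)=x+y$ on the unit square (not the density from this exercise) and a midpoint-rule double integral for the expectations; for this density the exact correlation works out to $-1/11$.

```python
import math

def f(x, y):
    # Hypothetical joint density on the unit square (integrates to 1);
    # this is NOT the density from the exercise above.
    return x + y

def expect(g, n=400):
    # Midpoint-rule approximation of E[g(X, Y)] = double integral of
    # g(x, y) * f(x, y) over [0, 1] x [0, 1].
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        for j in range(n):
            yj = (j + 0.5) * h
            total += g(xi, yj) * f(xi, yj) * h * h
    return total

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
std_x = math.sqrt(expect(lambda x, y: x * x) - ex ** 2)
std_y = math.sqrt(expect(lambda x, y: y * y) - ey ** 2)
cov = expect(lambda x, y: x * y) - ex * ey   # covariance formula
rho = cov / (std_x * std_y)
print(rho)  # close to -1/11, i.e. about -0.0909
```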
