Variance inflation factor

by Marco Taboga, PhD

In regression analysis, the variance inflation factor (VIF) is a measure of the degree of multicollinearity of one regressor with the other regressors.

Multicollinearity

Multicollinearity arises when a regressor is very similar to a linear combination of other regressors.

Multicollinearity has the effect of markedly increasing the variance of regression coefficient estimates. Therefore, we usually try to avoid it as much as possible.

To detect and measure multicollinearity, we use the so-called variance inflation factors.

The variance inflation factor is used to detect multicollinearity, a problem which inflates the variance of regression coefficient estimates.

The linear regression

Consider the linear regression $$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i, \qquad i = 1, \dots, N,$$ where $y_i$ is the dependent variable, $x_{i1}, \dots, x_{iK}$ are the $K$ regressors, $\varepsilon_i$ is the error term, and $\beta_1, \dots, \beta_K$ are the regression coefficients to be estimated.

Matrix form

The linear regression can be written in matrix form as $$y = X\beta + \varepsilon,$$ where $y$ is the $N \times 1$ vector of observations of the dependent variable, $X$ is the $N \times K$ design matrix whose $k$-th column $X_{\bullet,k}$ contains the $N$ observations of the $k$-th regressor, $\beta$ is the $K \times 1$ vector of regression coefficients, and $\varepsilon$ is the $N \times 1$ vector of error terms.

The OLS estimator

If the design matrix $X$ has full rank, then we can compute the ordinary least squares (OLS) estimator of the vector of regression coefficients $\beta$ as follows: $$\widehat{\beta} = \left(X^\top X\right)^{-1} X^\top y.$$
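For instance, here is a minimal numerical sketch of this formula (NumPy, with simulated data and hypothetical parameter values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small data set: N observations of K regressors.
N, K = 200, 3
X = rng.normal(size=(N, K))
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=N)

# OLS estimator: beta_hat = (X'X)^{-1} X'y (solve is more stable than inv).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to beta_true
```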

The variance of the coefficients

Under certain assumptions (see, e.g., the lecture on the Gauss-Markov theorem), the covariance matrix of the OLS estimator is $$\operatorname{Var}\left[\widehat{\beta}\right] = \sigma^2 \left(X^\top X\right)^{-1},$$ where $\sigma^2$ is the variance of the error terms.

Therefore, the variance of the OLS estimator of a single coefficient is $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \sigma^2 \left[\left(X^\top X\right)^{-1}\right]_{kk},$$ where $\left[\left(X^\top X\right)^{-1}\right]_{kk}$ is the $k$-th entry on the main diagonal of $\left(X^\top X\right)^{-1}$.
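In code, the coefficient variances are just the diagonal entries of this matrix; a sketch with simulated data and a hypothetical, known error variance:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 3
X = rng.normal(size=(N, K))
sigma2 = 0.25  # hypothetical error variance

# Covariance matrix of the OLS estimator: sigma^2 (X'X)^{-1}.
cov_beta_hat = sigma2 * np.linalg.inv(X.T @ X)

# Variance of the k-th estimated coefficient: the k-th diagonal entry.
print(np.diag(cov_beta_hat))
```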

A convenient expression for the variance

If the $k$-th regressor has zero mean, we can write the variance of its estimated coefficient as $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \frac{\sigma^2}{X_{\bullet,k}^\top X_{\bullet,k}} \cdot \frac{1}{1-R_k^2},$$ where $R_{k}^{2}$ is the R squared obtained by regressing the $k$-th regressor on all the other regressors.

Proof

Without loss of generality, suppose that $k=1$ (otherwise, change the order of the regressors). We can write the design matrix $X$ as a block matrix: $$X = \begin{bmatrix} X_{\bullet,1} & X_{\bullet,-1} \end{bmatrix},$$ where $X_{\bullet,1}$ is the first column of $X$ and the block $X_{\bullet,-1}$ contains all the other columns. Then, we have $$X^\top X = \begin{bmatrix} X_{\bullet,1}^\top X_{\bullet,1} & X_{\bullet,1}^\top X_{\bullet,-1} \\ X_{\bullet,-1}^\top X_{\bullet,1} & X_{\bullet,-1}^\top X_{\bullet,-1} \end{bmatrix}.$$ We use Schur complements, and in particular the formula $$\left[\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\right]_{11} = \left(A - B D^{-1} C\right)^{-1},$$ to write the first entry of the inverse of $X^\top X$ as $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top X_{\bullet,1} - X_{\bullet,1}^\top X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top X_{\bullet,1}\right)^{-1} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1},$$ where $$M = I - X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top.$$ As proved in the lecture on partitioned regressions, the matrix $M$ is idempotent and symmetric; moreover, when it is post-multiplied by $X_{\bullet,1}$, it gives as a result the residuals of a regression of $X_{\bullet,1}$ on $X_{\bullet,-1}$. The vector of these residuals is denoted by $$\widehat{\varepsilon}_1 = M X_{\bullet,1}.$$ Therefore, $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1} = \left(X_{\bullet,1}^\top M^\top M X_{\bullet,1}\right)^{-1} = \left(\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1\right)^{-1}.$$ If $X_{\bullet,1}$ has zero mean, the R squared of the regression of $X_{\bullet,1}$ on $X_{\bullet,-1}$ is $$R_1^2 = 1 - \frac{\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1}{X_{\bullet,1}^\top X_{\bullet,1}}.$$ Note that this formula for the R squared is correct only if $X_{\bullet,1}$ has zero mean. Then, we can write $$\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1 = X_{\bullet,1}^\top X_{\bullet,1}\left(1 - R_1^2\right).$$ Therefore, $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \frac{1}{X_{\bullet,1}^\top X_{\bullet,1}\left(1-R_1^2\right)}$$ and $$\operatorname{Var}\left[\widehat{\beta}_1\right] = \frac{\sigma^2}{X_{\bullet,1}^\top X_{\bullet,1}} \cdot \frac{1}{1-R_1^2}.$$
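The equality can also be checked numerically. Here is a sketch under the stated zero-mean assumption (simulated data; the hypothetical $\sigma^2$ cancels out of the comparison anyway):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 500, 3
X = rng.normal(size=(N, K))
X = X - X.mean(axis=0)  # enforce the zero-mean assumption
sigma2 = 1.0            # hypothetical error variance

k = 0
xk = X[:, k]
X_others = np.delete(X, k, axis=1)

# Auxiliary regression of the k-th regressor on the others, and its R squared.
g = np.linalg.lstsq(X_others, xk, rcond=None)[0]
resid = xk - X_others @ g
R2_k = 1.0 - (resid @ resid) / (xk @ xk)

# The two expressions for the variance agree.
lhs = sigma2 * np.linalg.inv(X.T @ X)[k, k]
rhs = sigma2 / (xk @ xk) / (1.0 - R2_k)
print(lhs, rhs)
```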

If the $k$-th regressor is orthogonal to all the other regressors, we can write the variance of its estimated coefficient as $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \frac{\sigma^2}{X_{\bullet,k}^\top X_{\bullet,k}}.$$

Proof

As in the previous proof, we assume without loss of generality that $k=1$. In that proof, we have demonstrated that $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1}.$$ If $X_{\bullet,1}$ is orthogonal to all the columns in $X_{\bullet,-1}$, then $$M X_{\bullet,1} = X_{\bullet,1} - X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top X_{\bullet,1} = X_{\bullet,1}.$$ Therefore, $$\operatorname{Var}\left[\widehat{\beta}_1\right] = \sigma^2 \left(X_{\bullet,1}^\top X_{\bullet,1}\right)^{-1} = \frac{\sigma^2}{X_{\bullet,1}^\top X_{\bullet,1}}.$$

The VIF

Thus, the variance of $\widehat{\beta}_k$ is the product of two terms:

  1. the variance that $\widehat{\beta}_k$ would have if the $k$-th regressor were orthogonal to all the other regressors;

  2. the term $\frac{1}{1-R_k^2}$, where $R_k^2$ is the R squared in a regression of the $k$-th regressor on all the other regressors.

The second term is called the variance inflation factor because it inflates the variance of $\widehat{\beta}_k$ with respect to the base case of orthogonality.

Actual variance equals hypothetical variance times VIF.
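A sketch of this decomposition in code (simulated, deliberately correlated regressors; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500
x1 = rng.normal(size=N)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=N)  # correlated with x1
x3 = rng.normal(size=N)                        # independent of the others
X = np.column_stack([x1, x2, x3])
X = X - X.mean(axis=0)  # zero-mean regressors, as the derivation assumes

def vif(X, k):
    """VIF of the k-th regressor: 1 / (1 - R^2) from the auxiliary regression."""
    xk = X[:, k]
    X_others = np.delete(X, k, axis=1)
    g = np.linalg.lstsq(X_others, xk, rcond=None)[0]
    resid = xk - X_others @ g
    R2 = 1.0 - (resid @ resid) / (xk @ xk)
    return 1.0 / (1.0 - R2)

print([round(vif(X, k), 2) for k in range(X.shape[1])])
# x1 and x2 have VIFs well above 1; x3 has a VIF close to 1.
```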

Assumption

In order to derive the VIF, we have made the important assumption that the $k$-th regressor has zero mean.

If this assumption is not met, then it is incorrect to compute the VIF as $$\frac{1}{1-R_k^2},$$ because the latter is no longer a factor in the formula that relates the actual variance of $\widehat{\beta}_k$ to its hypothetical variance under the assumption of orthogonality.

Demeaned regression

One way to make sure that the zero-mean assumption is met is to run a demeaned regression: before computing the OLS coefficient estimates, we demean all the variables.

As explained in the lecture on partitioned regression, demeaning does not change the coefficient estimates, provided that the regression includes a constant.

Note that a demeaned regression is a special case of a standardized regression. Therefore, we can run a standardized regression before computing variance inflation factors.
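This invariance of the coefficient estimates is easy to verify; here is a sketch with simulated data (variable names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 300
x1 = rng.normal(loc=5.0, size=N)   # regressors with non-zero means
x2 = rng.normal(loc=-2.0, size=N)
y = 1.5 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=N)

# Regression with a constant (a column of ones).
X_const = np.column_stack([np.ones(N), x1, x2])
b_const = np.linalg.lstsq(X_const, y, rcond=None)[0]

# Demeaned regression: demean all variables and drop the constant.
X_dem = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
b_dem = np.linalg.lstsq(X_dem, y - y.mean(), rcond=None)[0]

print(b_const[1:])  # slope estimates with the constant
print(b_dem)        # the same slope estimates after demeaning
```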

Be careful: the VIF provides useful indications only if some assumptions are met.

Orthogonality and zero correlation

We have explained above that the VIF provides a comparison between the actual variance of a coefficient estimator and its hypothetical variance (under the assumption of orthogonality).

By definition, the $k$-th regressor is orthogonal to all the other regressors if and only if $$X_{\bullet,k}^\top X_{\bullet,j} = 0$$ for all $j \neq k$.

If the k-th regressor has zero mean, then the orthogonality condition is equivalent to saying that the k-th regressor is uncorrelated with all the other regressors.

Proof

Denote the sample means of $X_{\bullet,k}$ and $X_{\bullet,j}$ by $\widehat{\mu}_k$ and $\widehat{\mu}_j$. We assume that $\widehat{\mu}_k = 0$. Then, the sample covariance between $X_{\bullet,k}$ and $X_{\bullet,j}$ is $$\frac{1}{N}\sum_{i=1}^{N}\left(X_{ik} - \widehat{\mu}_k\right)\left(X_{ij} - \widehat{\mu}_j\right) = \frac{1}{N}\sum_{i=1}^{N} X_{ik} X_{ij} - \widehat{\mu}_k \widehat{\mu}_j = \frac{1}{N} X_{\bullet,k}^\top X_{\bullet,j} = 0,$$ where the second equality follows from $\widehat{\mu}_k = 0$ and the last one from orthogonality. Therefore, $X_{\bullet,k}$ and $X_{\bullet,j}$ are uncorrelated.

This is why, if the $k$-th regressor has zero mean, the VIF provides a comparison between:

  1. the actual variance of the OLS estimator of its coefficient;

  2. the hypothetical variance that the estimator would have if the regressor were uncorrelated with all the other regressors.

How to actually compute the VIF

We usually compute the VIF for all the regressors. If there are many regressors and the sample size is large, computing the VIF as $$\mathrm{VIF}_k = \frac{1}{1-R_k^2}$$ can be quite burdensome because we need to run many large regressions (one for each $k$) in order to compute $K$ different R squareds.

A better alternative is to use the equivalent formula $$\mathrm{VIF}_k = \frac{\left[\left(X^\top X\right)^{-1}\right]_{kk}}{\left(X_{\bullet,k}^\top X_{\bullet,k}\right)^{-1}} = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k},$$ which can be easily derived from the formulae given above.

Proof

We have proved that $$\left[\left(X^\top X\right)^{-1}\right]_{kk} = \frac{1}{X_{\bullet,k}^\top X_{\bullet,k}\left(1-R_k^2\right)},$$ which implies that $$\frac{1}{1-R_k^2} = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k}.$$

When we use the latter formula, we compute $\left(X^\top X\right)^{-1}$ only once. Then, we use its $K$ diagonal entries to compute the $K$ VIFs.

The numbers $\left(X_{\bullet,k}^\top X_{\bullet,k}\right)^{-1}$ in the denominator are easy to calculate because each of them is the reciprocal of the inner product of a vector with itself.

The formula for the VIF reported by most sources is hard to use in practice. There is a better formula that is much less expensive from a computational viewpoint.

Recipe for computation

Here is the final recipe for computing the variance inflation factors:

  1. Make sure that your regression includes a constant (otherwise this recipe cannot be used).

  2. Demean all the variables and drop the constant.

  3. Compute $\left(X^\top X\right)^{-1}$, where $X$ is now the demeaned design matrix.

  4. For each $k$, compute the inner product $X_{\bullet,k}^\top X_{\bullet,k}$.

  5. The VIF for the $k$-th regressor is $$\mathrm{VIF}_k = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k}.$$
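Putting the recipe together, here is a sketch implementation (the function name vifs and the test data are illustrative, not part of the original lecture):

```python
import numpy as np

def vifs(X):
    """Compute all VIFs for the columns of X (the non-constant regressors).

    Assumes the original regression includes a constant, which is dropped
    after demeaning, as in the recipe above.
    """
    Xd = X - X.mean(axis=0)             # step 2: demean all the variables
    XtX_inv = np.linalg.inv(Xd.T @ Xd)  # step 3: invert X'X only once
    inner = (Xd * Xd).sum(axis=0)       # step 4: inner products X_k'X_k
    return np.diag(XtX_inv) * inner     # step 5: VIF_k for each k

rng = np.random.default_rng(4)
N = 500
x1 = rng.normal(size=N)
x2 = 0.9 * x1 + rng.normal(scale=0.4, size=N)
x3 = rng.normal(size=N)
print(vifs(np.column_stack([x1, x2, x3])))
```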

How to interpret the VIF

The VIF is equal to 1 if the regressor is uncorrelated with the other regressors, and greater than 1 in case of non-zero correlation.

The greater the VIF, the higher the degree of multicollinearity.

In the limit, when multicollinearity is perfect (i.e., the regressor is equal to a linear combination of other regressors), the VIF tends to infinity.

There is no precise rule for deciding when a VIF is too high (O'Brien 2007), but values above 10 are often considered a strong hint that trying to reduce the multicollinearity of the regression might be worthwhile.
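The following sketch illustrates this behavior: as a regressor approaches an exact linear function of another, its VIF blows up (simulated data; the noise scales are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 1000
x1 = rng.normal(size=N)

# Shrinking the noise makes x2 closer to a multiple of x1,
# so the VIF of x2 grows without bound.
for noise in (1.0, 0.5, 0.1, 0.01):
    x2 = x1 + rng.normal(scale=noise, size=N)
    Xd = np.column_stack([x1, x2])
    Xd = Xd - Xd.mean(axis=0)
    vif = np.diag(np.linalg.inv(Xd.T @ Xd)) * (Xd * Xd).sum(axis=0)
    print(f"noise scale {noise}: VIF of x2 = {vif[1]:.1f}")
```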

What to do when the VIF is high and other details

In the lecture on Multicollinearity, we discuss in more detail the interpretation of the variance inflation factor, and we explain how to deal with multicollinearity.

References

O'Brien, R. (2007) A Caution Regarding Rules of Thumb for Variance Inflation Factors, Quality & Quantity, 41, 673-690.


How to cite

Please cite as:

Taboga, Marco (2021). "Variance inflation factor", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/glossary/variance-inflation-factor.
