Variance inflation factor

by Marco Taboga, PhD

In regression analysis, the variance inflation factor (VIF) is a measure of the degree of multicollinearity of one regressor with the other regressors.

Multicollinearity

Multicollinearity arises when a regressor is very similar to a linear combination of other regressors.

Multicollinearity has the effect of markedly increasing the variance of regression coefficient estimates. Therefore, we usually try to avoid it as much as possible.

To detect and measure multicollinearity, we use the so-called variance inflation factors.

The variance inflation factor is used to detect multicollinearity, a problem which inflates the variance of regression coefficient estimates.

The linear regression

Consider the linear regression $$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i, \qquad i = 1, \dots, N,$$ where $y_i$ is the dependent variable, $x_{i1}, \dots, x_{iK}$ are the $K$ regressors, $\varepsilon_i$ is the error term, and $\beta_1, \dots, \beta_K$ are the regression coefficients to be estimated.

Matrix form

The linear regression can be written in matrix form as $$y = X\beta + \varepsilon,$$ where $y$ is the $N \times 1$ vector of observations of the dependent variable, $X$ is the $N \times K$ design matrix whose $k$-th column $X_{\bullet,k}$ contains the $N$ observations of the $k$-th regressor, $\beta$ is the $K \times 1$ vector of regression coefficients, and $\varepsilon$ is the $N \times 1$ vector of error terms.

The OLS estimator

If the design matrix $X$ has full rank, then we can compute the ordinary least squares (OLS) estimator of the vector of regression coefficients $\beta$ as follows: $$\widehat{\beta} = \left(X^\top X\right)^{-1} X^\top y.$$
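For instance, here is a minimal numerical sketch of this formula (NumPy, with simulated data and hypothetical parameter values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a small data set: N observations of K regressors.
N, K = 200, 3
X = rng.normal(size=(N, K))
beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=N)

# OLS estimator: beta_hat = (X'X)^{-1} X'y (solve is more stable than inv).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to beta_true
```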

The variance of the coefficients

Under certain assumptions (see, e.g., the lecture on the Gauss-Markov theorem), the covariance matrix of the OLS estimator is $$\operatorname{Var}\left[\widehat{\beta}\right] = \sigma^2 \left(X^\top X\right)^{-1},$$ where $\sigma^2$ is the variance of the error terms.

Therefore, the variance of the OLS estimator of a single coefficient is $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \sigma^2 \left[\left(X^\top X\right)^{-1}\right]_{kk},$$ where $\left[\left(X^\top X\right)^{-1}\right]_{kk}$ is the $k$-th entry on the main diagonal of $\left(X^\top X\right)^{-1}$.
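In code, the coefficient variances are just the diagonal entries of this matrix; a sketch with simulated data and a hypothetical, known error variance:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 3
X = rng.normal(size=(N, K))
sigma2 = 0.25  # hypothetical error variance

# Covariance matrix of the OLS estimator: sigma^2 (X'X)^{-1}.
cov_beta_hat = sigma2 * np.linalg.inv(X.T @ X)

# Variance of the k-th estimated coefficient: the k-th diagonal entry.
print(np.diag(cov_beta_hat))
```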

A convenient expression for the variance

If the $k$-th regressor has zero mean, we can write the variance of its estimated coefficient as $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \frac{\sigma^2}{X_{\bullet,k}^\top X_{\bullet,k}} \cdot \frac{1}{1-R_k^2},$$ where $R_{k}^{2}$ is the R squared obtained by regressing the $k$-th regressor on all the other regressors.

Proof

Without loss of generality, suppose that $k=1$ (otherwise, change the order of the regressors). We can write the design matrix $X$ as a block matrix: $$X = \begin{bmatrix} X_{\bullet,1} & X_{\bullet,-1} \end{bmatrix},$$ where $X_{\bullet,1}$ is the first column of $X$ and the block $X_{\bullet,-1}$ contains all the other columns. Then, we have $$X^\top X = \begin{bmatrix} X_{\bullet,1}^\top X_{\bullet,1} & X_{\bullet,1}^\top X_{\bullet,-1} \\ X_{\bullet,-1}^\top X_{\bullet,1} & X_{\bullet,-1}^\top X_{\bullet,-1} \end{bmatrix}.$$ We use Schur complements, and in particular the formula $$\left[\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\right]_{11} = \left(A - B D^{-1} C\right)^{-1},$$ to write the first entry of the inverse of $X^\top X$ as $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top X_{\bullet,1} - X_{\bullet,1}^\top X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top X_{\bullet,1}\right)^{-1} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1},$$ where $$M = I - X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top.$$ As proved in the lecture on partitioned regressions, the matrix $M$ is idempotent and symmetric; moreover, when it is post-multiplied by $X_{\bullet,1}$, it gives as a result the residuals of a regression of $X_{\bullet,1}$ on $X_{\bullet,-1}$. The vector of these residuals is denoted by $$\widehat{\varepsilon}_1 = M X_{\bullet,1}.$$ Therefore, $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1} = \left(X_{\bullet,1}^\top M^\top M X_{\bullet,1}\right)^{-1} = \left(\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1\right)^{-1}.$$ If $X_{\bullet,1}$ has zero mean, the R squared of the regression of $X_{\bullet,1}$ on $X_{\bullet,-1}$ is $$R_1^2 = 1 - \frac{\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1}{X_{\bullet,1}^\top X_{\bullet,1}}.$$ Note that this formula for the R squared is correct only if $X_{\bullet,1}$ has zero mean. Then, we can write $$\widehat{\varepsilon}_1^\top \widehat{\varepsilon}_1 = X_{\bullet,1}^\top X_{\bullet,1}\left(1 - R_1^2\right).$$ Therefore, $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \frac{1}{X_{\bullet,1}^\top X_{\bullet,1}\left(1-R_1^2\right)}$$ and $$\operatorname{Var}\left[\widehat{\beta}_1\right] = \frac{\sigma^2}{X_{\bullet,1}^\top X_{\bullet,1}} \cdot \frac{1}{1-R_1^2}.$$
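The equality can also be checked numerically. Here is a sketch under the stated zero-mean assumption (simulated data; the hypothetical $\sigma^2$ cancels out of the comparison anyway):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 500, 3
X = rng.normal(size=(N, K))
X = X - X.mean(axis=0)  # enforce the zero-mean assumption
sigma2 = 1.0            # hypothetical error variance

k = 0
xk = X[:, k]
X_others = np.delete(X, k, axis=1)

# Auxiliary regression of the k-th regressor on the others, and its R squared.
g = np.linalg.lstsq(X_others, xk, rcond=None)[0]
resid = xk - X_others @ g
R2_k = 1.0 - (resid @ resid) / (xk @ xk)

# The two expressions for the variance agree.
lhs = sigma2 * np.linalg.inv(X.T @ X)[k, k]
rhs = sigma2 / (xk @ xk) / (1.0 - R2_k)
print(lhs, rhs)
```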

If the $k$-th regressor is orthogonal to all the other regressors, we can write the variance of its estimated coefficient as $$\operatorname{Var}\left[\widehat{\beta}_k\right] = \frac{\sigma^2}{X_{\bullet,k}^\top X_{\bullet,k}}.$$

Proof

As in the previous proof, we assume without loss of generality that $k=1$. In that proof, we have demonstrated that $$\left[\left(X^\top X\right)^{-1}\right]_{11} = \left(X_{\bullet,1}^\top M X_{\bullet,1}\right)^{-1}.$$ If $X_{\bullet,1}$ is orthogonal to all the columns in $X_{\bullet,-1}$, then $$M X_{\bullet,1} = X_{\bullet,1} - X_{\bullet,-1}\left(X_{\bullet,-1}^\top X_{\bullet,-1}\right)^{-1} X_{\bullet,-1}^\top X_{\bullet,1} = X_{\bullet,1}.$$ Therefore, $$\operatorname{Var}\left[\widehat{\beta}_1\right] = \sigma^2 \left(X_{\bullet,1}^\top X_{\bullet,1}\right)^{-1} = \frac{\sigma^2}{X_{\bullet,1}^\top X_{\bullet,1}}.$$

The VIF

Thus, the variance of $\widehat{\beta}_k$ is the product of two terms:

  1. the variance that $\widehat{\beta}_k$ would have if the $k$-th regressor were orthogonal to all the other regressors;

  2. the term $\frac{1}{1-R_k^2}$, where $R_k^2$ is the R squared in a regression of the $k$-th regressor on all the other regressors.

The second term is called the variance inflation factor because it inflates the variance of $\widehat{\beta}_k$ with respect to the base case of orthogonality.

Actual variance equals hypothetical variance times VIF.
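A sketch of this decomposition in code (simulated, deliberately correlated regressors; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500
x1 = rng.normal(size=N)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=N)  # correlated with x1
x3 = rng.normal(size=N)                        # independent of the others
X = np.column_stack([x1, x2, x3])
X = X - X.mean(axis=0)  # zero-mean regressors, as the derivation assumes

def vif(X, k):
    """VIF of the k-th regressor: 1 / (1 - R^2) from the auxiliary regression."""
    xk = X[:, k]
    X_others = np.delete(X, k, axis=1)
    g = np.linalg.lstsq(X_others, xk, rcond=None)[0]
    resid = xk - X_others @ g
    R2 = 1.0 - (resid @ resid) / (xk @ xk)
    return 1.0 / (1.0 - R2)

print([round(vif(X, k), 2) for k in range(X.shape[1])])
# x1 and x2 have VIFs well above 1; x3 has a VIF close to 1.
```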

Assumption

In order to derive the VIF, we have made the important assumption that the $k$-th regressor has zero mean.

If this assumption is not met, then it is incorrect to compute the VIF as $$\frac{1}{1-R_k^2},$$ because the latter is no longer a factor in the formula that relates the actual variance of $\widehat{\beta}_k$ to its hypothetical variance under the assumption of orthogonality.

Demeaned regression

One way to make sure that the zero-mean assumption is met is to run a demeaned regression: before computing the OLS coefficient estimates, we demean all the variables.

As explained in the lecture on partitioned regression, demeaning does not change the coefficient estimates, provided that the regression includes a constant.

Note that a demeaned regression is a special case of a standardized regression. Therefore, we can run a standardized regression before computing variance inflation factors.
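This invariance of the coefficient estimates is easy to verify; here is a sketch with simulated data (variable names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 300
x1 = rng.normal(loc=5.0, size=N)   # regressors with non-zero means
x2 = rng.normal(loc=-2.0, size=N)
y = 1.5 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=N)

# Regression with a constant (a column of ones).
X_const = np.column_stack([np.ones(N), x1, x2])
b_const = np.linalg.lstsq(X_const, y, rcond=None)[0]

# Demeaned regression: demean all variables and drop the constant.
X_dem = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
b_dem = np.linalg.lstsq(X_dem, y - y.mean(), rcond=None)[0]

print(b_const[1:])  # slope estimates with the constant
print(b_dem)        # the same slope estimates after demeaning
```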

Be careful: the VIF provides useful indications only if some assumptions are met.

Orthogonality and zero correlation

We have explained above that the VIF provides a comparison between the actual variance of a coefficient estimator and its hypothetical variance (under the assumption of orthogonality).

By definition, the $k$-th regressor is orthogonal to all the other regressors if and only if $$X_{\bullet,k}^\top X_{\bullet,j} = 0$$ for all $j \neq k$.

If the k-th regressor has zero mean, then the orthogonality condition is equivalent to saying that the k-th regressor is uncorrelated with all the other regressors.

Proof

Denote the sample means of $X_{\bullet,k}$ and $X_{\bullet,j}$ by $\widehat{\mu}_k$ and $\widehat{\mu}_j$. We assume that $\widehat{\mu}_k = 0$. Then, the sample covariance between $X_{\bullet,k}$ and $X_{\bullet,j}$ is $$\frac{1}{N}\sum_{i=1}^{N}\left(X_{ik} - \widehat{\mu}_k\right)\left(X_{ij} - \widehat{\mu}_j\right) = \frac{1}{N}\sum_{i=1}^{N} X_{ik} X_{ij} - \widehat{\mu}_k \widehat{\mu}_j = \frac{1}{N} X_{\bullet,k}^\top X_{\bullet,j} = 0,$$ where the second equality follows from $\widehat{\mu}_k = 0$ and the last one from orthogonality. Therefore, $X_{\bullet,k}$ and $X_{\bullet,j}$ are uncorrelated.

This is why, if the $k$-th regressor has zero mean, the VIF provides a comparison between:

  1. the actual variance of the OLS estimator of its coefficient;

  2. the hypothetical variance that the estimator would have if the regressor were uncorrelated with all the other regressors.

How to actually compute the VIF

We usually compute the VIF for all the regressors. If there are many regressors and the sample size is large, computing the VIF as $$\mathrm{VIF}_k = \frac{1}{1-R_k^2}$$ can be quite burdensome because we need to run many large regressions (one for each $k$) in order to compute $K$ different R squareds.

A better alternative is to use the equivalent formula $$\mathrm{VIF}_k = \frac{\left[\left(X^\top X\right)^{-1}\right]_{kk}}{\left(X_{\bullet,k}^\top X_{\bullet,k}\right)^{-1}} = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k},$$ which can be easily derived from the formulae given above.

Proof

We have proved that $$\left[\left(X^\top X\right)^{-1}\right]_{kk} = \frac{1}{X_{\bullet,k}^\top X_{\bullet,k}\left(1-R_k^2\right)},$$ which implies that $$\frac{1}{1-R_k^2} = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k}.$$

When we use the latter formula, we compute $\left(X^\top X\right)^{-1}$ only once. Then, we use its $K$ diagonal entries to compute the $K$ VIFs.

The numbers $\left(X_{\bullet,k}^\top X_{\bullet,k}\right)^{-1}$ in the denominator are easy to calculate because each of them is the reciprocal of the inner product of a vector with itself.

The formula for the VIF reported by most sources is hard to use in practice. There is a better formula that is much less expensive from a computational viewpoint.

Recipe for computation

Here is the final recipe for computing the variance inflation factors:

  1. Make sure that your regression includes a constant (otherwise this recipe cannot be used).

  2. Demean all the variables and drop the constant.

  3. Compute $\left(X^\top X\right)^{-1}$, where $X$ is now the demeaned design matrix.

  4. For each $k$, compute the inner product $X_{\bullet,k}^\top X_{\bullet,k}$.

  5. The VIF for the $k$-th regressor is $$\mathrm{VIF}_k = \left[\left(X^\top X\right)^{-1}\right]_{kk}\, X_{\bullet,k}^\top X_{\bullet,k}.$$
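Putting the recipe together, here is a sketch implementation (the function name vifs and the test data are illustrative, not part of the original lecture):

```python
import numpy as np

def vifs(X):
    """Compute all VIFs for the columns of X (the non-constant regressors).

    Assumes the original regression includes a constant, which is dropped
    after demeaning, as in the recipe above.
    """
    Xd = X - X.mean(axis=0)             # step 2: demean all the variables
    XtX_inv = np.linalg.inv(Xd.T @ Xd)  # step 3: invert X'X only once
    inner = (Xd * Xd).sum(axis=0)       # step 4: inner products X_k'X_k
    return np.diag(XtX_inv) * inner     # step 5: VIF_k for each k

rng = np.random.default_rng(4)
N = 500
x1 = rng.normal(size=N)
x2 = 0.9 * x1 + rng.normal(scale=0.4, size=N)
x3 = rng.normal(size=N)
print(vifs(np.column_stack([x1, x2, x3])))
```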

How to interpret the VIF

The VIF is equal to 1 if the regressor is uncorrelated with the other regressors, and greater than 1 in case of non-zero correlation.

The greater the VIF, the higher the degree of multicollinearity.

In the limit, when multicollinearity is perfect (i.e., the regressor is equal to a linear combination of other regressors), the VIF tends to infinity.

There is no precise rule for deciding when a VIF is too high (O'Brien 2007), but values above 10 are often considered a strong hint that trying to reduce the multicollinearity of the regression might be worthwhile.
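The following sketch illustrates this behavior: as a regressor approaches an exact linear function of another, its VIF blows up (simulated data; the noise scales are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 1000
x1 = rng.normal(size=N)

# Shrinking the noise makes x2 closer to a multiple of x1,
# so the VIF of x2 grows without bound.
for noise in (1.0, 0.5, 0.1, 0.01):
    x2 = x1 + rng.normal(scale=noise, size=N)
    Xd = np.column_stack([x1, x2])
    Xd = Xd - Xd.mean(axis=0)
    vif = np.diag(np.linalg.inv(Xd.T @ Xd)) * (Xd * Xd).sum(axis=0)
    print(f"noise scale {noise}: VIF of x2 = {vif[1]:.1f}")
```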

What to do when the VIF is high and other details

In the lecture on Multicollinearity, we discuss in more detail the interpretation of the variance inflation factor, and we explain how to deal with multicollinearity.

References

O'Brien, R. (2007) A Caution Regarding Rules of Thumb for Variance Inflation Factors, Quality & Quantity, 41, 673-690.


How to cite

Please cite as:

Taboga, Marco (2021). "Variance inflation factor", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/glossary/variance-inflation-factor.
