Partitioned regression

by Marco Taboga, PhD

Partitioned linear regression is a technique used to subdivide the independent variables into two groups and estimate their coefficients in two separate steps.

Partitioned regression is often used to solve problems in which estimating all the regression coefficients together would be too computationally intensive.

The regression model

Consider the linear regression model in matrix form
$$y=X\beta +\varepsilon$$
where $y$ is the $N\times 1$ vector of observations of the dependent variable, $X$ is the $N\times K$ matrix of regressors (the design matrix), $\beta$ is the $K\times 1$ vector of regression coefficients, and $\varepsilon$ is the $N\times 1$ vector of errors.

Two groups of regressors

We divide the regressors into two groups:

  1. group 1 contains the first $K_{1}$ regressors;

  2. group 2 contains the remaining $K_{2}$ regressors.

Obviously $K_{1}+K_{2}=K$.

Partitioned matrices

We use the subdivision into two groups to partition the vectors and matrices that appear in the regression equation:
$$y=\begin{bmatrix} X_{1} & X_{2} \end{bmatrix}\begin{bmatrix} \beta _{1}\\ \beta _{2} \end{bmatrix}+\varepsilon =X_{1}\beta _{1}+X_{2}\beta _{2}+\varepsilon$$

The dimensions of the blocks are as follows: $X_{1}$ is $N\times K_{1}$, $X_{2}$ is $N\times K_{2}$, $\beta _{1}$ is $K_{1}\times 1$ and $\beta _{2}$ is $K_{2}\times 1$.
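
To make the block structure concrete, here is a minimal NumPy sketch; the dimensions and the randomly generated data are ours, chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K1, K2 = 8, 2, 3                     # illustrative dimensions, K = K1 + K2

X1 = rng.normal(size=(N, K1))           # first group of regressors,  N x K1
X2 = rng.normal(size=(N, K2))           # second group of regressors, N x K2
X = np.hstack([X1, X2])                 # full design matrix,         N x K

beta1 = rng.normal(size=K1)             # K1 x 1 block of coefficients
beta2 = rng.normal(size=K2)             # K2 x 1 block of coefficients
beta = np.concatenate([beta1, beta2])   # stacked coefficient vector, K x 1

# The partitioned product equals the sum of the block products:
# X beta = X1 beta1 + X2 beta2.
assert np.allclose(X @ beta, X1 @ beta1 + X2 @ beta2)
```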

The OLS estimator

Remember that, when $X$ is full-rank, the OLS estimator of the vector $\beta$ can be written as
$$\widehat{\beta }=\left( X^{\top }X\right) ^{-1}X^{\top }y$$

The OLS estimator can also be partitioned as
$$\widehat{\beta }=\begin{bmatrix} \widehat{\beta }_{1}\\ \widehat{\beta }_{2} \end{bmatrix}$$
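
For concreteness, a minimal NumPy sketch (simulated data, variable names ours) that computes the OLS estimator and splits it into the two blocks:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K1, K2 = 200, 2, 3
X1 = rng.normal(size=(N, K1))
X2 = rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ rng.normal(size=K1 + K2) + rng.normal(size=N)

# OLS estimator (X'X)^{-1} X'y, computed by solving the linear system
# rather than forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Partition the estimate into the two groups of coefficients.
beta1_hat, beta2_hat = beta_hat[:K1], beta_hat[K1:]
```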

Normal equations

If we multiply both sides of the OLS formula by $X^{\top }X$, we obtain the so-called normal equations:
$$X^{\top }X\widehat{\beta }=X^{\top }y$$

In partitioned form, the normal equations become
$$\begin{bmatrix} X_{1}^{\top }\\ X_{2}^{\top } \end{bmatrix}\begin{bmatrix} X_{1} & X_{2} \end{bmatrix}\begin{bmatrix} \widehat{\beta }_{1}\\ \widehat{\beta }_{2} \end{bmatrix}=\begin{bmatrix} X_{1}^{\top }\\ X_{2}^{\top } \end{bmatrix}y$$

By using the multiplication rule for partitioned matrices, we obtain
$$\begin{bmatrix} X_{1}^{\top }X_{1} & X_{1}^{\top }X_{2}\\ X_{2}^{\top }X_{1} & X_{2}^{\top }X_{2} \end{bmatrix}\begin{bmatrix} \widehat{\beta }_{1}\\ \widehat{\beta }_{2} \end{bmatrix}=\begin{bmatrix} X_{1}^{\top }y\\ X_{2}^{\top }y \end{bmatrix}$$
which can be written as two separate equations:
$$X_{1}^{\top }X_{1}\widehat{\beta }_{1}+X_{1}^{\top }X_{2}\widehat{\beta }_{2}=X_{1}^{\top }y$$
$$X_{2}^{\top }X_{1}\widehat{\beta }_{1}+X_{2}^{\top }X_{2}\widehat{\beta }_{2}=X_{2}^{\top }y$$

These two equations are used to derive most of the results about partitioned regressions.
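
The two equations can also be checked numerically; the sketch below (NumPy, simulated data, names ours) verifies that the OLS estimate satisfies both partitioned normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K1, K2 = 200, 2, 3
X1 = rng.normal(size=(N, K1))
X2 = rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ rng.normal(size=K1 + K2) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
b1, b2 = beta_hat[:K1], beta_hat[K1:]

# First normal equation:  X1'X1 b1 + X1'X2 b2 = X1'y
assert np.allclose(X1.T @ X1 @ b1 + X1.T @ X2 @ b2, X1.T @ y)
# Second normal equation: X2'X1 b1 + X2'X2 b2 = X2'y
assert np.allclose(X2.T @ X1 @ b1 + X2.T @ X2 @ b2, X2.T @ y)
```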

The main result

Here is the main result about partitioned regressions, proved in this section and explained in the next one:
$$\widehat{\beta }_{2}=\left( X_{2}^{\ast \top }X_{2}^{\ast }\right) ^{-1}X_{2}^{\ast \top }y^{\ast }$$
$$\widehat{\beta }_{1}=\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\left( y-X_{2}\widehat{\beta }_{2}\right)$$
where
$$y^{\ast }=M_{1}y,\qquad X_{2}^{\ast }=M_{1}X_{2},\qquad M_{1}=I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }$$

Proof

The first normal equation, derived previously, is
$$X_{1}^{\top }X_{1}\widehat{\beta }_{1}+X_{1}^{\top }X_{2}\widehat{\beta }_{2}=X_{1}^{\top }y$$
We write it as
$$X_{1}^{\top }X_{1}\widehat{\beta }_{1}=X_{1}^{\top }\left( y-X_{2}\widehat{\beta }_{2}\right)$$
or
$$\widehat{\beta }_{1}=\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\left( y-X_{2}\widehat{\beta }_{2}\right)$$
The second normal equation is
$$X_{2}^{\top }X_{1}\widehat{\beta }_{1}+X_{2}^{\top }X_{2}\widehat{\beta }_{2}=X_{2}^{\top }y$$
We substitute the expression for $\widehat{\beta }_{1}$:
$$X_{2}^{\top }X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\left( y-X_{2}\widehat{\beta }_{2}\right) +X_{2}^{\top }X_{2}\widehat{\beta }_{2}=X_{2}^{\top }y$$
which is equivalent to
$$X_{2}^{\top }X_{2}\widehat{\beta }_{2}-X_{2}^{\top }X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }X_{2}\widehat{\beta }_{2}=X_{2}^{\top }y-X_{2}^{\top }X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }y$$
or
$$X_{2}^{\top }\left( I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\right) X_{2}\widehat{\beta }_{2}=X_{2}^{\top }\left( I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\right) y$$
We define
$$M_{1}=I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }$$
so that
$$X_{2}^{\top }M_{1}X_{2}\widehat{\beta }_{2}=X_{2}^{\top }M_{1}y$$
or
$$\widehat{\beta }_{2}=\left( X_{2}^{\top }M_{1}X_{2}\right) ^{-1}X_{2}^{\top }M_{1}y$$
The matrix $M_{1}$ is idempotent:
$$M_{1}M_{1}=\left( I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\right) \left( I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\right) =I-2X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }+X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }=M_{1}$$
It is also symmetric, as can be easily verified:
$$M_{1}^{\top }=I^{\top }-\left( X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\right) ^{\top }=I-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }=M_{1}$$
Therefore, we have
$$\widehat{\beta }_{2}=\left( X_{2}^{\top }M_{1}^{\top }M_{1}X_{2}\right) ^{-1}X_{2}^{\top }M_{1}^{\top }M_{1}y=\left( X_{2}^{\ast \top }X_{2}^{\ast }\right) ^{-1}X_{2}^{\ast \top }y^{\ast }$$
where we have defined
$$X_{2}^{\ast }=M_{1}X_{2},\qquad y^{\ast }=M_{1}y$$

In practice, the calculations are performed in reverse order, starting from the last equation: we first compute $\widehat{\beta }_{2}$ and then use it to compute $\widehat{\beta }_{1}$.
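
The result, and the reverse order of the calculations, can be illustrated with a short NumPy sketch; the data and variable names are ours, and the comparison is against the estimate from a single OLS regression on all regressors:

```python
import numpy as np

rng = np.random.default_rng(3)
N, K1, K2 = 200, 2, 3
X1 = rng.normal(size=(N, K1))
X2 = rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ rng.normal(size=K1 + K2) + rng.normal(size=N)

# Full OLS, for comparison.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# M1 = I - X1 (X1'X1)^{-1} X1' (symmetric and idempotent).
M1 = np.eye(N) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
y_star = M1 @ y          # residuals of y regressed on X1
X2_star = M1 @ X2        # residuals of the columns of X2 regressed on X1

# Main result, computed in reverse order: first beta2_hat, then beta1_hat.
beta2_hat = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)
beta1_hat = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ beta2_hat))

assert np.allclose(np.concatenate([beta1_hat, beta2_hat]), beta_hat)
```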

Interpretation of the result

Working out the formulae above is equivalent to deriving the OLS estimators in three steps:

  1. we regress $y$ and the columns of $X_{2}$ on $X_{1}$; the residuals from these regressions are $y^{\ast }$ and $X_{2}^{\ast }$;

  2. we find $\widehat{\beta }_{2}$ by regressing $y^{\ast }$ on $X_{2}^{\ast }$;

  3. we calculate $\widehat{\beta }_{1}$ by regressing the residuals $y-X_{2}\widehat{\beta }_{2}$ on $X_{1}$.

Proof

Let us start from the first step. When we regress $y$ on $X_{1}$, the OLS estimator of the regression coefficients is
$$\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }y$$
and the residuals are
$$y^{\ast }=y-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }y=M_{1}y$$
Similarly, when we regress the columns of $X_{2}$ on $X_{1}$, the OLS estimators of the regression coefficients are the columns of the matrix
$$\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }X_{2}$$
and the vectors of residuals are the columns of the matrix
$$X_{2}^{\ast }=X_{2}-X_{1}\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }X_{2}=M_{1}X_{2}$$
In the second step, we regress $y^{\ast }$ on $X_{2}^{\ast }$. The OLS estimator of the regression coefficients is
$$\left( X_{2}^{\ast \top }X_{2}^{\ast }\right) ^{-1}X_{2}^{\ast \top }y^{\ast }$$
But we have proved above that this is also the OLS estimator $\widehat{\beta }_{2}$ of $\beta _{2}$ in our partitioned regression. In the third step, we regress $y-X_{2}\widehat{\beta }_{2}$ on $X_{1}$. The OLS estimator of the regression coefficients is
$$\left( X_{1}^{\top }X_{1}\right) ^{-1}X_{1}^{\top }\left( y-X_{2}\widehat{\beta }_{2}\right)$$
But we have proved above that this is also the OLS estimator $\widehat{\beta }_{1}$ of $\beta _{1}$ in our partitioned regression.

Frisch-Waugh-Lovell theorem

The fact that $\widehat{\beta }_{2}$ can be calculated by regressing $y^{\ast }$ on $X_{2}^{\ast }$ is often called the Frisch-Waugh-Lovell theorem.
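
As a check of the theorem, the following NumPy sketch (simulated data, names ours) carries out the three regressions of the previous section and compares the result with a single OLS regression on all regressors:

```python
import numpy as np

rng = np.random.default_rng(4)
N, K1, K2 = 200, 2, 3
X1 = rng.normal(size=(N, K1))
X2 = rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ rng.normal(size=K1 + K2) + rng.normal(size=N)

def ols(A, b):
    """Least-squares coefficients of a regression of b on A."""
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Step 1: regress y and each column of X2 on X1 and keep the residuals.
y_star = y - X1 @ ols(X1, y)
X2_star = X2 - X1 @ ols(X1, X2)

# Step 2: regress y_star on X2_star to obtain beta2_hat.
beta2_hat = ols(X2_star, y_star)

# Step 3: regress y - X2 beta2_hat on X1 to obtain beta1_hat.
beta1_hat = ols(X1, y - X2 @ beta2_hat)

# The stepwise estimates coincide with full OLS on [X1, X2].
assert np.allclose(np.concatenate([beta1_hat, beta2_hat]), ols(X, y))
```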

Example: demeaned regression

As an example, we discuss the so-called demeaned regression.

Suppose that the first column of $X$ is a vector of ones (corresponding to the so-called intercept).

We partition the design matrix as
$$X=\begin{bmatrix} 1_{N} & X_{2} \end{bmatrix}$$
where $1_{N}$ is the $N\times 1$ vector of ones and $X_{2}$ contains all the other regressors.

Let us see what happens in the 3 steps explained above.

First step

In the first step, we regress $y$ on $1_{N}$.

The OLS estimate of the regression coefficient is
$$\left( 1_{N}^{\top }1_{N}\right) ^{-1}1_{N}^{\top }y=\frac{1}{N}\sum_{i=1}^{N}y_{i}=\overline{y}$$
where $\overline{y}$ is the sample mean of $y$.

Therefore,
$$y^{\ast }=y-1_{N}\overline{y}$$

In other words, $y^{\ast }$ is the demeaned version of $y$.

Similarly, $X_{2}^{\ast }$ is the demeaned version of $X_{2}$:
$$X_{2}^{\ast }=X_{2}-1_{N}\overline{x}_{2}$$
where $\overline{x}_{2}$ is a $1\times K_{2}$ row vector that contains the sample means of the columns of $X_{2}$.
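
A quick numerical illustration of this first step (a minimal NumPy sketch on randomly generated data, names ours):

```python
import numpy as np

rng = np.random.default_rng(6)
N = 50
ones = np.ones((N, 1))
y = rng.normal(loc=3.0, size=N)

# Regressing y on a column of ones yields the sample mean of y ...
coef = np.linalg.lstsq(ones, y, rcond=None)[0]
assert np.isclose(coef[0], y.mean())

# ... and the residuals are the demeaned version of y.
residuals = y - ones @ coef
assert np.allclose(residuals, y - y.mean())
```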

Second step

In the second step, we regress $y^{\ast }$ on $X_{2}^{\ast }$ and we obtain as a result $\widehat{\beta }_{2}$.

The vector $\widehat{\beta }_{2}$ is equal to the OLS estimator of the regression coefficients of $X_{2}$ in the original regression of $y$ on $1_{N}$ and $X_{2}$.

Thus, we have an important rule here: running a regression with an intercept is equivalent to demeaning all the variables and running the same regression without the intercept.

Third step

In the third step, we calculate $\widehat{\beta }_{1}$ (the intercept) by regressing the residuals $y-X_{2}\widehat{\beta }_{2}$ on $1_{N}$.

We already know that regressing a variable on $1_{N}$ is the same as calculating its sample mean.

Therefore, the intercept $\widehat{\beta }_{1}$ is equal to the sample mean of the residuals $y-X_{2}\widehat{\beta }_{2}$.
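
The whole example can be verified numerically. In the sketch below (NumPy, simulated data, names ours), the slopes from the demeaned regression and the intercept computed as the mean of the residuals coincide with the estimates from the regression that includes the intercept explicitly:

```python
import numpy as np

rng = np.random.default_rng(5)
N, K2 = 200, 3
ones = np.ones((N, 1))
X2 = rng.normal(size=(N, K2))
y = 1.5 + X2 @ rng.normal(size=K2) + rng.normal(size=N)

# Full regression of y on an intercept and X2.
X = np.hstack([ones, X2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
intercept_full, slopes_full = beta_hat[0], beta_hat[1:]

# Equivalent procedure: slopes from demeaned y on demeaned X2 ...
y_dem = y - y.mean()
X2_dem = X2 - X2.mean(axis=0)
slopes_dem = np.linalg.lstsq(X2_dem, y_dem, rcond=None)[0]

# ... and the intercept as the sample mean of the residuals y - X2 * slopes.
intercept_dem = (y - X2 @ slopes_dem).mean()

assert np.allclose(slopes_dem, slopes_full)
assert np.isclose(intercept_dem, intercept_full)
```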

How to cite

Please cite as:

Taboga, Marco (2021). "Partitioned regression", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/partitioned-regression.
