Partitioned linear regression is a technique used to subdivide the independent variables into two groups and estimate their coefficients in two separate steps.
Partitioned regression is often used to solve problems in which estimating all the regression coefficients together would be too computationally intensive.
Consider the linear regression model in matrix form
$$y = X\beta + \varepsilon$$
where:
$y$ is the $N\times 1$ vector of observations of the dependent variable;
$X$ is the $N\times K$ matrix of regressors ($N$ observations and $K$ regressors);
$\beta$ is the $K\times 1$ vector of regression coefficients;
$\varepsilon$ is the $N\times 1$ vector of error terms.
We divide the regressors into two groups:
group 1 contains the first $K_1$ regressors;
group 2 contains the remaining $K_2$.
Obviously, $K_1 + K_2 = K$.
We use the subdivision into two groups to partition the vectors and matrices that appear in the regression equation:
$$y = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \varepsilon = X_1 \beta_1 + X_2 \beta_2 + \varepsilon$$
The dimensions of the blocks are as follows: $X_1$ and $X_2$ are $N\times K_1$ and $N\times K_2$ respectively; $\beta_1$ and $\beta_2$ are $K_1\times 1$ and $K_2\times 1$ respectively.
Remember that, when $X$ is full-rank, the OLS estimator of the vector $\beta$ can be written as
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y$$
The OLS estimator can also be partitioned as
$$\widehat{\beta} = \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix}$$
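For concreteness, here is a minimal numerical sketch in Python with NumPy (not part of the original lecture: the data are simulated and variable names such as `X1` and `beta_hat` are illustrative) that computes the full OLS estimator and partitions it into the two blocks:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K1, K2 = 100, 2, 3              # N observations, K = K1 + K2 regressors

X1 = rng.normal(size=(N, K1))      # first group of regressors
X2 = rng.normal(size=(N, K2))      # second group of regressors
X = np.hstack([X1, X2])            # partitioned design matrix X = [X1, X2]
y = X @ np.arange(1.0, K1 + K2 + 1) + rng.normal(size=N)

# OLS estimator (X'X)^(-1) X'y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Partition the estimator conformably with [X1, X2]
beta1_hat, beta2_hat = beta_hat[:K1], beta_hat[K1:]
```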
If we multiply both sides of the OLS formula by $X^\top X$, we obtain the so-called normal equations:
$$X^\top X \widehat{\beta} = X^\top y$$
In partitioned form, the normal equations become
$$\begin{bmatrix} X_1^\top \\ X_2^\top \end{bmatrix} \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1^\top \\ X_2^\top \end{bmatrix} y$$
By using the multiplication rule for partitioned matrices, we obtain
$$\begin{bmatrix} X_1^\top X_1 & X_1^\top X_2 \\ X_2^\top X_1 & X_2^\top X_2 \end{bmatrix} \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1^\top y \\ X_2^\top y \end{bmatrix}$$
which can be written as two separate equations:
$$X_1^\top X_1 \widehat{\beta}_1 + X_1^\top X_2 \widehat{\beta}_2 = X_1^\top y$$
$$X_2^\top X_1 \widehat{\beta}_1 + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
These two equations are used to derive most of the results about partitioned regressions.
Here is the main result about partitioned regressions, proved in this section and explained in the next one:
$$\widehat{\beta}_1 = (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
$$\widehat{\beta}_2 = (X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*$$
where $y^* = M_1 y$, $X_2^* = M_1 X_2$ and $M_1 = I - X_1 (X_1^\top X_1)^{-1} X_1^\top$.
The first normal equation, derived previously, is
$$X_1^\top X_1 \widehat{\beta}_1 + X_1^\top X_2 \widehat{\beta}_2 = X_1^\top y$$
We write it as
$$X_1^\top X_1 \widehat{\beta}_1 = X_1^\top (y - X_2 \widehat{\beta}_2)$$
or
$$\widehat{\beta}_1 = (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
The second normal equation is
$$X_2^\top X_1 \widehat{\beta}_1 + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
We substitute the expression for $\widehat{\beta}_1$:
$$X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2) + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
which is equivalent to
$$X_2^\top X_2 \widehat{\beta}_2 - X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top X_2 \widehat{\beta}_2 = X_2^\top y - X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top y$$
or
$$X_2^\top \left( I - X_1 (X_1^\top X_1)^{-1} X_1^\top \right) X_2 \widehat{\beta}_2 = X_2^\top \left( I - X_1 (X_1^\top X_1)^{-1} X_1^\top \right) y$$
We define
$$M_1 = I - X_1 (X_1^\top X_1)^{-1} X_1^\top$$
so that
$$X_2^\top M_1 X_2 \widehat{\beta}_2 = X_2^\top M_1 y$$
or
$$\widehat{\beta}_2 = (X_2^\top M_1 X_2)^{-1} X_2^\top M_1 y$$
The matrix $M_1$ is idempotent:
$$M_1 M_1 = I - 2 X_1 (X_1^\top X_1)^{-1} X_1^\top + X_1 (X_1^\top X_1)^{-1} X_1^\top X_1 (X_1^\top X_1)^{-1} X_1^\top = I - X_1 (X_1^\top X_1)^{-1} X_1^\top = M_1$$
It is also symmetric, as can be easily verified:
$$M_1^\top = I - \left( X_1 (X_1^\top X_1)^{-1} X_1^\top \right)^\top = I - X_1 (X_1^\top X_1)^{-1} X_1^\top = M_1$$
Therefore, we have
$$\widehat{\beta}_2 = (X_2^\top M_1^\top M_1 X_2)^{-1} X_2^\top M_1^\top M_1 y = (X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*$$
where we have defined
$$y^* = M_1 y \qquad \text{and} \qquad X_2^* = M_1 X_2$$
The calculations need to be performed in reverse order, starting from the last equation: we first compute $\widehat{\beta}_2$ from $y^*$ and $X_2^*$, and then plug it into the formula for $\widehat{\beta}_1$.
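The result is easy to check numerically. The following sketch (again in Python with NumPy, on simulated data with illustrative names; it is a verification, not part of the original lecture) builds the matrix $M_1$, computes $\widehat{\beta}_2$ and $\widehat{\beta}_1$ from the formulas above, and confirms that they match the blocks of the full OLS estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K1, K2 = 100, 2, 3
X1, X2 = rng.normal(size=(N, K1)), rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ np.arange(1.0, K1 + K2 + 1) + rng.normal(size=N)

# Full OLS on X = [X1, X2]
beta_full = np.linalg.solve(X.T @ X, X.T @ y)

# M1 = I - X1 (X1'X1)^(-1) X1'  (symmetric and idempotent)
M1 = np.eye(N) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
y_star, X2_star = M1 @ y, M1 @ X2

# Main result: beta2_hat from (X2*' X2*)^(-1) X2*' y*, then beta1_hat
beta2_hat = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)
beta1_hat = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ beta2_hat))

assert np.allclose(np.concatenate([beta1_hat, beta2_hat]), beta_full)
```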
Working out the formulae above is equivalent to deriving the OLS estimators in three steps:
we regress
and the columns of
on
;
the residuals from these regressions are
and
;
we find
by regressing
on
;
we calculate
by regressing the residuals
on
.
Let us start from the first step. When we regress $y$ on $X_1$, the OLS estimator of the regression coefficients is
$$(X_1^\top X_1)^{-1} X_1^\top y$$
and the residuals are
$$y - X_1 (X_1^\top X_1)^{-1} X_1^\top y = M_1 y = y^*$$
Similarly, when we regress the columns of $X_2$ on $X_1$, the OLS estimators of the regression coefficients are the columns of the matrix
$$(X_1^\top X_1)^{-1} X_1^\top X_2$$
and the vectors of residuals are the columns of the matrix
$$X_2 - X_1 (X_1^\top X_1)^{-1} X_1^\top X_2 = M_1 X_2 = X_2^*$$
In the second step, we regress $y^*$ on $X_2^*$. The OLS estimator of the regression coefficients is
$$(X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*$$
But we have proved above that this is also the OLS estimator of $\beta_2$ in our partitioned regression. In the third step, we regress $y - X_2 \widehat{\beta}_2$ on $X_1$. The OLS estimator of the regression coefficients is
$$(X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
But we have proved above that this is also the OLS estimator of $\beta_1$ in our partitioned regression.
The fact that $\widehat{\beta}_2$ can be calculated by regressing $y^*$ on $X_2^*$ is often called the Frisch-Waugh-Lovell theorem.
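Here is the promised sketch of the three steps, carried out as three explicit least-squares regressions (we use `np.linalg.lstsq` so that no matrix inverse is formed explicitly; the simulated data and the helper `ols_residuals` are, again, our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K1, K2 = 100, 2, 3
X1, X2 = rng.normal(size=(N, K1)), rng.normal(size=(N, K2))
X = np.hstack([X1, X2])
y = X @ np.arange(1.0, K1 + K2 + 1) + rng.normal(size=N)

def ols_residuals(A, b):
    """Residuals of an OLS regression of b (vector or matrix) on A."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return b - A @ coef

# Step 1: regress y and the columns of X2 on X1; keep the residuals
y_star = ols_residuals(X1, y)
X2_star = ols_residuals(X1, X2)

# Step 2: regress y* on X2* to obtain beta2_hat
beta2_hat, *_ = np.linalg.lstsq(X2_star, y_star, rcond=None)

# Step 3: regress y - X2 beta2_hat on X1 to obtain beta1_hat
beta1_hat, *_ = np.linalg.lstsq(X1, y - X2 @ beta2_hat, rcond=None)

# The two blocks coincide with the full OLS estimator, as the theorem states
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(np.concatenate([beta1_hat, beta2_hat]), beta_full)
```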
As an example, we discuss the so-called demeaned regression.
Suppose that the first column of $X$ is a vector of ones (corresponding to the so-called intercept).
We partition the design matrix $X$ as
$$X = \begin{bmatrix} \iota & X_2 \end{bmatrix}$$
where $\iota$ is the $N\times 1$ vector of ones and $X_2$ contains all the other regressors.
Let us see what happens in the three steps explained above.
In the first step, we regress $y$ on $\iota$. The OLS estimate of the regression coefficient is
$$(\iota^\top \iota)^{-1} \iota^\top y = \frac{1}{N} \sum_{i=1}^{N} y_i = \bar{y}$$
where $\bar{y}$ is the sample mean of $y$. Therefore,
$$y^* = y - \iota \bar{y}$$
In other words, $y^*$ is the demeaned version of $y$. Similarly, $X_2^*$ is the demeaned version of $X_2$:
$$X_2^* = X_2 - \iota \bar{X}_2$$
where $\bar{X}_2$ is a $1\times K_2$ row vector that contains the sample means of the columns of $X_2$.
In the second step, we regress $y^*$ on $X_2^*$ and we obtain as a result $\widehat{\beta}_2$. The vector $\widehat{\beta}_2$ is equal to the OLS estimator of the regression coefficients of $X_2$ in the original regression of $y$ on $\iota$ and $X_2$.
Thus, we have an important rule here: running a regression with an intercept is equivalent to demeaning all the variables and running the same regression without the intercept.
In the third step, we calculate $\widehat{\beta}_1$ (the intercept) by regressing the residuals $y - X_2 \widehat{\beta}_2$ on $\iota$.
We already know that regressing a variable on $\iota$ is the same as calculating its sample mean.
Therefore, the intercept $\widehat{\beta}_1$ is equal to the sample mean of the residuals $y - X_2 \widehat{\beta}_2$.
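A minimal sketch of the demeaned-regression equivalence (simulated data and names of our own choosing): the slopes from the demeaned regression without an intercept coincide with those from the regression with an intercept, and the intercept equals the sample mean of the residuals $y - X_2 \widehat{\beta}_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K2 = 200, 3
X2 = rng.normal(size=(N, K2))
y = 0.5 + X2 @ np.array([1.0, -2.0, 3.0]) + rng.normal(size=N)

# Regression with an intercept: X = [iota, X2]
X = np.hstack([np.ones((N, 1)), X2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
intercept, slopes = beta_hat[0], beta_hat[1:]

# Demeaned regression, run without an intercept
y_star = y - y.mean()
X2_star = X2 - X2.mean(axis=0)
slopes_demeaned = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)

assert np.allclose(slopes_demeaned, slopes)
# The intercept is the sample mean of the residuals y - X2 beta2_hat
assert np.allclose(intercept, (y - X2 @ slopes_demeaned).mean())
```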
Please cite as:
Taboga, Marco (2021). "Partitioned regression", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/partitioned-regression.