In linear regression analysis, the normal equations are a system of equations whose solution is the Ordinary Least Squares (OLS) estimator of the regression coefficients.
The normal equations are derived from the first-order condition of the Least Squares minimization problem.
Let us start from the simple linear regression model
$$y_i = \alpha + \beta x_i + \varepsilon_i$$
where:
$y_i$ is the dependent variable;
$\alpha$ is the constant (or intercept);
$x_i$ is the regressor;
$\beta$ is the regression coefficient (or slope);
$\varepsilon_i$ is the zero-mean error term.
There are $N$ observations $(y_i, x_i)$ in the sample: $i = 1, \ldots, N$.
The normal equations for the simple regression model are:
$$\frac{1}{N}\sum_{i=1}^{N} y_i = \widehat{\alpha} + \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i$$
$$\frac{1}{N}\sum_{i=1}^{N} x_i y_i = \widehat{\alpha}\,\frac{1}{N}\sum_{i=1}^{N} x_i + \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i^2$$
where $\widehat{\alpha}$ and $\widehat{\beta}$ (the two unknowns) are the estimators of $\alpha$ and $\beta$.
The OLS estimators of $\alpha$ and $\beta$, denoted by $\widehat{\alpha}$ and $\widehat{\beta}$, are derived by minimizing the sum of squared residuals:
$$S(\alpha, \beta) = \sum_{i=1}^{N} \left(y_i - \alpha - \beta x_i\right)^2$$
We carry out the minimization by computing the first-order conditions for a minimum. In other words, we calculate the derivatives of $S(\alpha, \beta)$ with respect to $\alpha$ and $\beta$, and we set them equal to zero:
$$\frac{\partial S}{\partial \alpha}\left(\widehat{\alpha}, \widehat{\beta}\right) = -2\sum_{i=1}^{N}\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = 0$$
$$\frac{\partial S}{\partial \beta}\left(\widehat{\alpha}, \widehat{\beta}\right) = -2\sum_{i=1}^{N} x_i\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = 0$$
We divide the two equations by $-2N$ and obtain the equivalent system
$$\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = 0$$
$$\frac{1}{N}\sum_{i=1}^{N} x_i\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = 0$$
Since
$$\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = \frac{1}{N}\sum_{i=1}^{N} y_i - \widehat{\alpha} - \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i$$
and
$$\frac{1}{N}\sum_{i=1}^{N} x_i\left(y_i - \widehat{\alpha} - \widehat{\beta} x_i\right) = \frac{1}{N}\sum_{i=1}^{N} x_i y_i - \widehat{\alpha}\,\frac{1}{N}\sum_{i=1}^{N} x_i - \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i^2,$$
we can write
$$\frac{1}{N}\sum_{i=1}^{N} y_i = \widehat{\alpha} + \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i$$
$$\frac{1}{N}\sum_{i=1}^{N} x_i y_i = \widehat{\alpha}\,\frac{1}{N}\sum_{i=1}^{N} x_i + \widehat{\beta}\,\frac{1}{N}\sum_{i=1}^{N} x_i^2,$$
which are the two normal equations displayed above.
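As an aside (not part of the original derivation), the same first-order conditions can be reproduced symbolically. Here is a minimal SymPy sketch for a hypothetical sample of five observations, with the symbols a and b standing in for $\widehat{\alpha}$ and $\widehat{\beta}$:
```python
import sympy as sp

N = 5                              # illustrative sample size (assumption)
a, b = sp.symbols('a b')           # stand-ins for alpha-hat and beta-hat
x = sp.symbols(f'x0:{N}')          # regressor values x_0, ..., x_4
y = sp.symbols(f'y0:{N}')          # dependent-variable values y_0, ..., y_4

# Sum of squared residuals S(a, b).
S = sum((y[i] - a - b * x[i]) ** 2 for i in range(N))

# First-order conditions dS/da = 0 and dS/db = 0: the normal equations.
normal_eqs = [sp.Eq(sp.diff(S, a), 0), sp.Eq(sp.diff(S, b), 0)]

# Solving the two normal equations yields the OLS estimators in closed form.
sol = sp.solve(normal_eqs, [a, b])
print(sp.simplify(sol[b]))         # the usual sample cov(x, y) / var(x) ratio
```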
Thus, in the case of a simple linear regression, the normal equations are a system of two equations in two unknowns ($\widehat{\alpha}$ and $\widehat{\beta}$). If the system has a unique solution, then the two values of $\widehat{\alpha}$ and $\widehat{\beta}$ that solve the system are the OLS estimators of the intercept $\alpha$ and the slope $\beta$, respectively.
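To see the system at work numerically, here is a minimal sketch in Python/NumPy, assuming synthetic data with a true intercept of 1.0 and a true slope of 2.0 (all data and parameter values are illustrative assumptions, not from the original text):
```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample (illustrative): y_i = 1.0 + 2.0 * x_i + noise.
N = 100
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=N)

# The two normal equations as a 2x2 linear system A @ (alpha_hat, beta_hat) = c:
#   mean(y)   = alpha_hat           + beta_hat * mean(x)
#   mean(x*y) = alpha_hat * mean(x) + beta_hat * mean(x**2)
A = np.array([[1.0,      x.mean()],
              [x.mean(), (x ** 2).mean()]])
c = np.array([y.mean(), (x * y).mean()])
alpha_hat, beta_hat = np.linalg.solve(A, c)

# Cross-check against the familiar closed-form OLS solution.
beta_check = np.cov(x, y, bias=True)[0, 1] / x.var()
alpha_check = y.mean() - beta_check * x.mean()
assert np.isclose(alpha_hat, alpha_check) and np.isclose(beta_hat, beta_check)
print(alpha_hat, beta_hat)  # close to the true values (1.0, 2.0)
```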
In a multiple linear regression, in which there is more than one regressor, the regression equation can be written in matrix form:
$$y = X\beta + \varepsilon$$
where:
$y$ is the $N \times 1$ vector of dependent variables;
$X$ is the $N \times K$ matrix of regressors (the so-called design matrix);
$\beta$ is the $K \times 1$ vector of regression coefficients;
$\varepsilon$ is the $N \times 1$ vector of error terms.
The normal equations for the multiple regression model are expressed in matrix form as
$$X^\top X \widehat{\beta} = X^\top y$$
where the unknown $\widehat{\beta}$ is a $K \times 1$ vector (the estimator of $\beta$).
The OLS estimator of the vector $\beta$, denoted by $\widehat{\beta}$, is derived by minimizing the sum of squared residuals, which can be written in matrix form as follows:
$$S(\beta) = \left(y - X\beta\right)^\top \left(y - X\beta\right)$$
In order to find a minimizer, we compute the first-order condition for a minimum. We calculate the gradient of $S(\beta)$ (the vector of partial derivatives with respect to the entries of $\beta$) and we set it equal to zero:
$$\nabla_\beta S\left(\widehat{\beta}\right) = -2X^\top\left(y - X\widehat{\beta}\right) = 0$$
We divide the equations by $-2$ and obtain
$$X^\top\left(y - X\widehat{\beta}\right) = 0,$$
that is,
$$X^\top X \widehat{\beta} = X^\top y,$$
which is a system of normal equations expressed in matrix form.
Thus, in the case of the multiple regression model, the normal equations, expressed above in matrix form, are a system of $K$ equations in $K$ unknowns (the $K$ entries of the coefficient vector $\widehat{\beta}$). If the system has a unique solution, the value of $\widehat{\beta}$ that solves the system is the OLS estimator of the vector $\beta$.
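As a quick numerical check (a sketch, not from the original text), the following Python/NumPy snippet fits a hypothetical model with $K = 3$ coefficients by least squares and verifies that the fitted vector satisfies the matrix normal equations:
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design matrix: an intercept column plus two regressors.
N, K = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])        # illustrative true coefficients
y = X @ beta_true + rng.normal(scale=0.3, size=N)

# OLS fit via least squares (no explicit normal equations here).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The first-order condition X'(y - X beta_hat) = 0 should hold up to
# floating-point error, i.e. X'X beta_hat = X'y.
assert np.allclose(X.T @ (y - X @ beta_hat), 0.0, atol=1e-8)
```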
As stated above, the normal equations are just a system of $K$ linear equations in $K$ unknowns. Therefore, we can employ the standard methods for solving linear systems. For example, if the equations are expressed in matrix form and the matrix $X^\top X$ is invertible, we can write the solution as
$$\widehat{\beta} = \left(X^\top X\right)^{-1} X^\top y.$$
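Continuing the sketch above, the normal equations can be solved directly. In practice, calling a linear solver on $X^\top X$ is generally preferable to forming the explicit inverse, although both reproduce the same estimate here:
```python
# Solve X'X beta = X'y directly (preferred: no explicit inverse is formed).
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Textbook formula with the explicit inverse, shown for comparison only.
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

assert np.allclose(beta_ne, beta_hat) and np.allclose(beta_inv, beta_hat)
```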
If you want to double-check the formulae and the derivations shown above, you can consult these references:
Greene, W.H. (2003). Econometric analysis, Fifth Edition. Prentice Hall.
Gujarati, D.N. (2004). Basic econometrics, Fourth Edition. McGraw-Hill.