This lecture presents the score test, also known as the Lagrange multiplier (LM) test.
The score test is used to conduct tests of hypotheses about parameters that have been estimated with the maximum likelihood (ML) method.
To better understand the material presented here, you should be familiar with the main concepts of hypothesis testing in an ML framework (see the introductory lecture entitled Maximum likelihood - Hypothesis testing).
The score test can be used to test null hypotheses of the following kind:
$$H_0 : g(\theta_0) = 0,$$
where $\theta_0$ is an unknown parameter belonging to a parameter space $\Theta \subseteq \mathbb{R}^p$, and $g : \mathbb{R}^p \to \mathbb{R}^r$ is a vector valued function ($r \le p$).
As we have explained in the introductory lecture mentioned above, most of the common parameter restrictions that one might want to test can be written in the form $g(\theta_0) = 0$.
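The statistic employed in the score test is based on the ML estimate $\widehat{\theta}_n^R$ that is obtained from the solution of the constrained optimization problem
$$\widehat{\theta}_n^R = \arg\max_{\theta \in \Theta_R} \ln L(\theta; \xi_n),$$
where $\xi_n$ is the sample of observed data, $L(\theta; \xi_n)$ is the likelihood function, and
$$\Theta_R = \{\theta \in \Theta : g(\theta) = 0\}$$
is the set of parameters that satisfy the restriction that is being tested.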
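The test statistic, called the score statistic, is
$$LM_n = \frac{1}{n}\, \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n)^\top \, \widehat{V}_n \, \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n),$$
where $n$ is the sample size, $\widehat{V}_n$ is a consistent estimate of the asymptotic covariance matrix $V$ of the ML estimator (see the lecture entitled Maximum likelihood - Covariance matrix estimation), and $\nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n)$ is the gradient of the log-likelihood function (called the score), that is, the vector of partial derivatives of the log-likelihood function with respect to the entries of the parameter vector $\theta$, evaluated at $\widehat{\theta}_n^R$.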
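The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `score_statistic` and its arguments are hypothetical, and the sketch assumes that the score and the covariance estimate have already been computed at the constrained estimate.

```python
def score_statistic(score, v_hat, n):
    """Return LM_n = (1/n) * score' * V_hat * score.

    score : list of p floats, the gradient of the log-likelihood
            evaluated at the constrained estimate (assumed given)
    v_hat : p x p nested list, estimate of the asymptotic covariance matrix
    n     : sample size
    """
    p = len(score)
    if len(v_hat) != p or any(len(row) != p for row in v_hat):
        raise ValueError("v_hat must be a p x p matrix matching the score")
    # Quadratic form score' V_hat score, written out entry by entry.
    quad = sum(score[i] * v_hat[i][j] * score[j]
               for i in range(p) for j in range(p))
    return quad / n
```

For instance, with the made-up inputs `score = [0.0, 6.0]`, `v_hat = [[2.0, 1.0], [1.0, 2.0]]` and `n = 9`, the quadratic form equals 72 and the function returns 8.0.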
In order to derive the asymptotic properties of the statistic $LM_n$, the following assumptions will be maintained:
the sample $\xi_n$ and the likelihood function $L$ satisfy some set of conditions that are sufficient to guarantee consistency and asymptotic normality of the unconstrained ML estimator $\widehat{\theta}_n$ (see the lecture on maximum likelihood estimation for a set of such conditions);
for each $\theta \in \Theta$, the entries of $g(\theta)$ are continuously differentiable with respect to all entries of $\theta$;
the $r \times p$ matrix of the partial derivatives of the entries of $g$ with respect to the entries of $\theta$, called the Jacobian of $g$ and denoted by $J_g(\theta)$, has rank $r$.
Given these assumptions, and under the null hypothesis that $g(\theta_0) = 0$, the statistic $LM_n$ converges in distribution to a Chi-square distribution.
Proposition Provided some technical conditions are satisfied (see above), and provided the null hypothesis $g(\theta_0) = 0$ is true, the score statistic $LM_n$ converges in distribution to a Chi-square distribution with $r$ degrees of freedom.
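Proof. Denote by $\widehat{\theta}_n$ the unconstrained maximum likelihood estimate:
$$\widehat{\theta}_n = \arg\max_{\theta \in \Theta} \ln L(\theta; \xi_n).$$
By the Mean Value Theorem, we have that
$$g(\widehat{\theta}_n) = g(\widehat{\theta}_n^R) + J_g(\bar{\theta}_n)\,(\widehat{\theta}_n - \widehat{\theta}_n^R),$$
where $\bar{\theta}_n$ is an intermediate point (a vector whose components lie strictly between the components of $\widehat{\theta}_n^R$ and those of $\widehat{\theta}_n$). Since $g(\widehat{\theta}_n^R) = 0$, we have that
$$g(\widehat{\theta}_n) = J_g(\bar{\theta}_n)\,(\widehat{\theta}_n - \widehat{\theta}_n^R).$$
Therefore,
$$\sqrt{n}\, g(\widehat{\theta}_n) = J_g(\bar{\theta}_n)\, \sqrt{n}\,(\widehat{\theta}_n - \widehat{\theta}_n^R).$$
Again by the Mean Value Theorem, we have that
$$\nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n) = \nabla_\theta \ln L(\widehat{\theta}_n; \xi_n) + \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\,(\widehat{\theta}_n^R - \widehat{\theta}_n),$$
where $\nabla_{\theta\theta} \ln L$ is the Hessian matrix (a matrix of second partial derivatives) and $\ddot{\theta}_n$ is an intermediate point (to be precise, there is a different intermediate point for each row of the Hessian). Because the gradient is zero at an unconstrained maximum, we have that
$$\nabla_\theta \ln L(\widehat{\theta}_n; \xi_n) = 0$$
and, as a consequence,
$$\nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n) = \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\,(\widehat{\theta}_n^R - \widehat{\theta}_n)$$
and
$$\frac{1}{\sqrt{n}} \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n) = \left[-\frac{1}{n} \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\right] \sqrt{n}\,(\widehat{\theta}_n - \widehat{\theta}_n^R).$$
It follows that
$$\sqrt{n}\, g(\widehat{\theta}_n) = J_g(\bar{\theta}_n) \left[-\frac{1}{n} \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\right]^{-1} \frac{1}{\sqrt{n}} \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n).$$
Now, the first-order conditions of the constrained problem (with the Lagrangian written as $\ln L(\theta; \xi_n) - \lambda^\top g(\theta)$) give
$$\nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n) = J_g(\widehat{\theta}_n^R)^\top \lambda_n,$$
where $\lambda_n$ is an $r \times 1$ vector of Lagrange multipliers. Thus, we have that
$$\sqrt{n}\, g(\widehat{\theta}_n) = J_g(\bar{\theta}_n) \left[-\frac{1}{n} \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\right]^{-1} J_g(\widehat{\theta}_n^R)^\top \frac{1}{\sqrt{n}} \lambda_n.$$
Solving for $\lambda_n$, we obtain
$$\frac{1}{\sqrt{n}} \lambda_n = A_n^{-1} \sqrt{n}\, g(\widehat{\theta}_n), \qquad A_n = J_g(\bar{\theta}_n) \left[-\frac{1}{n} \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\right]^{-1} J_g(\widehat{\theta}_n^R)^\top.$$

Now, the score statistic can be written as
$$LM_n = \frac{1}{n} \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n)^\top \widehat{V}_n \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n) = \frac{1}{\sqrt{n}} \lambda_n^\top \, J_g(\widehat{\theta}_n^R)\, \widehat{V}_n\, J_g(\widehat{\theta}_n^R)^\top \frac{1}{\sqrt{n}} \lambda_n.$$
Plugging in the previously derived expression for $\lambda_n$, the statistic becomes
$$LM_n = \sqrt{n}\, g(\widehat{\theta}_n)^\top \left(A_n^{-1}\right)^\top B_n\, A_n^{-1} \sqrt{n}\, g(\widehat{\theta}_n),$$
where
$$B_n = J_g(\widehat{\theta}_n^R)\, \widehat{V}_n\, J_g(\widehat{\theta}_n^R)^\top.$$

Given that under the null hypothesis both $\widehat{\theta}_n$ and $\widehat{\theta}_n^R$ converge in probability to $\theta_0$, also $\bar{\theta}_n$ and $\ddot{\theta}_n$ converge in probability to $\theta_0$, because the entries of $\bar{\theta}_n$ and $\ddot{\theta}_n$ lie strictly between the entries of $\widehat{\theta}_n$ and $\widehat{\theta}_n^R$. Moreover,
$$\operatorname*{plim}_{n \to \infty} \left[-\frac{1}{n} \nabla_{\theta\theta} \ln L(\ddot{\theta}_n; \xi_n)\right] = V^{-1},$$
where $V$ is the asymptotic covariance matrix of $\widehat{\theta}_n$. We had previously assumed that also $\widehat{V}_n$ converges in probability to $V$. Therefore, by the continuous mapping theorem, we have the following results:
$$\operatorname*{plim}_{n \to \infty} A_n = J_g(\theta_0)\, V\, J_g(\theta_0)^\top, \qquad \operatorname*{plim}_{n \to \infty} B_n = J_g(\theta_0)\, V\, J_g(\theta_0)^\top.$$

By putting together everything we have derived so far, and using the identity $\left(A_n^{-1}\right)^\top B_n A_n^{-1} = \left(A_n B_n^{-1} A_n^\top\right)^{-1}$, we can write the score statistic as a sequence of quadratic forms
$$LM_n = Y_n^\top W_n^{-1} Y_n,$$
where
$$Y_n = \sqrt{n}\, g(\widehat{\theta}_n)$$
and
$$W_n = A_n B_n^{-1} A_n^\top, \qquad \operatorname*{plim}_{n \to \infty} W_n = J_g(\theta_0)\, V\, J_g(\theta_0)^\top.$$
But in the lecture on the Wald test, we have proved that such a sequence converges in distribution to a Chi-square random variable with a number of degrees of freedom equal to $r$.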
In the score test, the null hypothesis is rejected if the score statistic exceeds a pre-determined critical value $z$, that is, if
$$LM_n > z.$$
The size of the test can be approximated by its asymptotic value
$$\alpha = \mathrm{P}(LM_n > z) \approx 1 - F(z),$$
where $F(z)$ is the distribution function of a Chi-square random variable with $r$ degrees of freedom.
We can choose $z$ so as to achieve a pre-determined size, as follows:
$$z = F^{-1}(1 - \alpha).$$
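The choice of the critical value and the rejection rule can be sketched as follows, using only the Python standard library. The helper names `chi2_sf`, `critical_value` and `reject_null` are hypothetical, and the sketch assumes an integer number of degrees of freedom; in practice a statistical library (for example SciPy's `scipy.stats.chi2.ppf`) gives the same quantile directly.

```python
import math

def chi2_sf(z, df):
    """Upper-tail probability 1 - F(z) of a Chi-square with integer df >= 1."""
    if z <= 0:
        return 1.0
    half = z / 2.0
    if df % 2 == 1:
        # df = 1: 1 - F(z) = erfc(sqrt(z / 2))
        tail, nu = math.erfc(math.sqrt(half)), 1
    else:
        # df = 2: 1 - F(z) = exp(-z / 2)
        tail, nu = math.exp(-half), 2
    # Recurrence on the degrees of freedom:
    # Q(z; nu + 2) = Q(z; nu) + (z/2)^(nu/2) * exp(-z/2) / Gamma(nu/2 + 1)
    while nu < df:
        tail += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return tail

def critical_value(alpha, df, tol=1e-10):
    """Solve 1 - F(z) = alpha for z by bisection."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def reject_null(lm, alpha, df):
    """Score-test decision: reject when LM_n exceeds the critical value."""
    return lm > critical_value(alpha, df)
```

Bisection is used here only to keep the sketch dependency-free; it works because the upper-tail probability is strictly decreasing in $z$.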
A simple example of how the score test can be used follows.
Example
Let the parameter space be the set of all $2$-dimensional vectors, i.e., $\Theta = \mathbb{R}^2$. Denote the first and second component of the true parameter $\theta_0$ by $\theta_{0,1}$ and $\theta_{0,2}$.
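Suppose we want to test a restriction that fixes one of the two components of $\theta_0$ at a given value. In this case, the function $g$ is a function $g : \mathbb{R}^2 \to \mathbb{R}$ whose value is the difference between that component and its hypothesized value. We have that $r = 1$ and the Jacobian of $g$ is a $1 \times 2$ row vector with one entry equal to $1$ and the other equal to $0$, whose rank is equal to $1$. Note also that it does not depend on $\theta$. We then maximize the log-likelihood function with respect to the unrestricted component (keeping the restricted component fixed at its hypothesized value).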
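Suppose we obtain an estimate $\widehat{\theta}_n^R$ of the parameter and an estimate $\widehat{V}_n$ of the asymptotic covariance matrix, where $n$ is the sample size. Suppose also that we have computed the value of the score $\nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n)$. Then, the score statistic is
$$LM_n = \frac{1}{n}\, \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n)^\top \, \widehat{V}_n \, \nabla_\theta \ln L(\widehat{\theta}_n^R; \xi_n).$$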
The statistic has asymptotically a Chi-square distribution with $r = 1$ degree of freedom. Suppose we want the size of our test to be $\alpha = 1\%$. Then, the critical value $z$ is
$$z = F^{-1}(1 - \alpha) = F^{-1}(0.99) \approx 6.635,$$
where $F$ is the cumulative distribution function of a Chi-square random variable with $1$ degree of freedom and $F^{-1}(0.99)$ can be calculated with any statistical software (we have done it in MATLAB, using the command chi2inv(0.99,1)). The test statistic exceeds the critical value,
$$LM_n > z \approx 6.635,$$
and we reject the null hypothesis.
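As a further illustration with a fully specified model, the following sketch applies the score test to a setting that is not part of this lecture: $n$ independent Bernoulli draws with success probability $p$ and the restriction $p_0 = 0.5$. The model, the function name `bernoulli_score_test` and all the numbers are assumptions made only for illustration. With a single parameter fixed by the restriction, the constrained estimate is the hypothesized value itself, and the covariance estimate is the inverse of the Fisher information of one observation.

```python
import math

def bernoulli_score_test(successes, n, p_null):
    """Score test of H0: p = p_null for n Bernoulli draws (illustrative).

    Returns the score at the constrained estimate, the covariance
    estimate, and the score statistic LM_n.
    """
    failures = n - successes
    # Derivative of the log-likelihood
    # successes * ln(p) + failures * ln(1 - p), evaluated at p_null.
    score = successes / p_null - failures / (1.0 - p_null)
    # Inverse of the Fisher information of one observation at p_null.
    v_hat = p_null * (1.0 - p_null)
    # LM_n = (1/n) * score * V_hat * score (all quantities are scalars here).
    lm = score * v_hat * score / n
    return score, v_hat, lm

# Made-up data: 60 successes out of 100 draws.
score, v_hat, lm = bernoulli_score_test(successes=60, n=100, p_null=0.5)
# With 1 degree of freedom the Chi-square upper tail is erfc(sqrt(z / 2)).
p_value = math.erfc(math.sqrt(lm / 2.0))
print(score, v_hat, lm, p_value)
```

With these made-up counts the score is 40, the covariance estimate is 0.25 and the statistic equals 4, which is below the 1% critical value of about 6.635, so in this hypothetical case the null hypothesis would not be rejected at that size.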