Statlect: The Digital Textbook

Score test

This lecture presents the score test, also known as the Lagrange multiplier (LM) test.

The score test is used to conduct tests of hypotheses about parameters that have been estimated with the maximum likelihood (ML) method.

To better understand the material presented here, you should be familiar with the main concepts of hypothesis testing in an ML framework (see the introductory lecture entitled Maximum likelihood - Hypothesis testing).

The score test can be used to test null hypotheses of the following kind: [eq1] where $\theta_{0}$ is an unknown parameter belonging to a parameter space [eq2], and [eq3] is a vector-valued function ($r\leq p$).

As we have explained in the introductory lecture mentioned above, most of the common parameter restrictions that one might want to test can be written in the form [eq4].

The score statistic

The statistic employed in the score test is based on the ML estimate [eq5] that is obtained from the solution of the constrained optimization problem [eq6] where $\xi_{n}$ is the sample of observed data, [eq7] is the likelihood function, and [eq8] is the set of parameters that satisfy the restriction that is being tested.

The test statistic, called the score statistic, is [eq9] where $n$ is the sample size, $\widehat{V}_{n}$ is a consistent estimate of the asymptotic covariance matrix of the estimator [eq10] (see the lecture entitled Maximum likelihood - Covariance matrix estimation), and [eq11] is the gradient of the log-likelihood function (called the score), that is, the vector of partial derivatives of the log-likelihood function with respect to the entries of the parameter vector $\theta$.
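As a rough illustration, the computation of the statistic can be sketched in Python with NumPy. The sketch assumes that the statistic is the quadratic form $\frac{1}{n}s^{\top}\widehat{V}_{n}s$, where $s$ is the score evaluated at the constrained estimate. The function name `score_statistic` and all the numbers are hypothetical and are not taken from this lecture.

```python
import numpy as np

def score_statistic(score, V_hat, n):
    """Quadratic form (1/n) * s' V_hat s.

    Assumption: this is the form of the LM statistic; `score` is the
    gradient of the log-likelihood at the constrained estimate and
    `V_hat` estimates the asymptotic covariance matrix.
    """
    s = np.asarray(score, dtype=float).reshape(-1)
    V = np.asarray(V_hat, dtype=float)
    if V.shape != (s.size, s.size):
        raise ValueError("V_hat must be a square matrix matching the score length")
    return float(s @ V @ s) / n

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
score = np.array([4.0, 0.0])
V_hat = np.array([[2.0, 0.0],
                  [0.0, 8.0]])
n = 50

lm = score_statistic(score, V_hat, n)
print(lm)  # (4 * 2 * 4) / 50 = 0.64
```

Only the entries of the score that are not forced to zero by the constrained maximization contribute to the quadratic form, which is why the second diagonal entry of `V_hat` plays no role here.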

In order to derive the asymptotic properties of the statistic $LM_{n}$, the following assumptions will be maintained:

Given these assumptions, and under the null hypothesis that [eq14], the statistic $LM_{n}$ converges in distribution to a Chi-square distribution.

Proposition Provided some technical conditions are satisfied (see above), and provided the null hypothesis [eq15] is true, the score statistic $LM_{n}$ converges in distribution to a Chi-square distribution with $r$ degrees of freedom.


Denote by [eq16] the unconstrained maximum likelihood estimate: [eq17] By the Mean Value Theorem, we have that [eq18] where [eq19] is an intermediate point (a vector whose components lie strictly between the components of [eq20] and those of [eq21]). Since [eq22] $\Theta_{R}$, we have that [eq23] Therefore, [eq24]

Again by the Mean Value Theorem, we have that [eq25] where [eq26] is the Hessian matrix (a matrix of second partial derivatives) and [eq27] is an intermediate point (to be precise, there is a different intermediate point for each row of the Hessian). Because the gradient is zero at an unconstrained maximum, we have that [eq28] and, as a consequence, [eq29] and [eq30] It follows that [eq31]

Now, [eq32] where $\lambda$ is an $r\times 1$ vector of Lagrange multipliers. Thus, we have that [eq33] Solving for $\lambda$, we obtain [eq34]

Now, the score statistic can be written as [eq35] Plugging in the previously derived expression for $\lambda$, the statistic becomes [eq36] where [eq37]

Given that under the null hypothesis both [eq38] and [eq39] converge in probability to $\theta_{0}$, also [eq40] and [eq41] converge in probability to $\theta_{0}$, because the entries of [eq42] and [eq43] lie strictly between the entries of [eq44] and [eq45]. Moreover, [eq46] where $V$ is the asymptotic covariance matrix of [eq47]. We have also assumed that $\widehat{V}_{n}$ converges in probability to $V$. Therefore, by the continuous mapping theorem, we have the following results: [eq48]

By putting together everything we have derived so far, we can write the score statistic as a sequence of quadratic forms [eq49] where [eq50] and [eq51] In the lecture on the Wald test, we have proved that such a sequence converges in distribution to a Chi-square random variable with a number of degrees of freedom equal to [eq52].
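The convergence result can also be checked numerically. The following is a minimal Monte Carlo sketch under assumptions that are not part of this lecture: the data are drawn from a normal distribution with mean 0 and unknown variance, the restriction being tested is that the mean equals 0, and the statistic is taken to be $\frac{1}{n}s^{\top}\widehat{V}_{n}s$, which for this model reduces to $n\bar{x}^{2}/\overline{x^{2}}$. If the Chi-square approximation with 1 degree of freedom is adequate, the rejection frequency under the null should be close to the nominal size.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n, reps, alpha = 200, 5000, 0.05   # hypothetical simulation settings

# Each row is one sample generated under the null (mean 0, std. dev. 2).
x = rng.normal(loc=0.0, scale=2.0, size=(reps, n))

# Constrained ML estimate of the variance, with the mean fixed at 0.
sigma2_r = np.mean(x**2, axis=1)

# Score with respect to the mean at the constrained estimate; the score
# with respect to the variance is zero there, so it does not contribute.
score_mu = np.sum(x, axis=1) / sigma2_r

# (1/n) * score' * V_hat * score, where the relevant entry of V_hat is
# sigma2_r (inverse of the per-observation Fisher information).
lm = score_mu**2 * sigma2_r / n

z = chi2.ppf(1.0 - alpha, df=1)        # asymptotic critical value
rejection_rate = float(np.mean(lm > z))
print(rejection_rate)                  # should be close to alpha
```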

The test

In the score test, the null hypothesis is rejected if the score statistic exceeds a pre-determined critical value $z$, that is, if[eq53]

The size of the test can be approximated by its asymptotic value[eq54]

where $F\left( z\right)$ is the distribution function of a Chi-square random variable with $r$ degrees of freedom.

We can choose $z$ so as to achieve a pre-determined size, as follows:[eq55]
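In practice, the critical value is obtained by inverting the Chi-square distribution function. A minimal sketch with SciPy follows, assuming the critical value is defined by $1-F\left( z\right) =\alpha$. The helper names `critical_value` and `asymptotic_p_value` are hypothetical.

```python
from scipy.stats import chi2

def critical_value(alpha, r):
    """Return z such that 1 - F(z) = alpha, with F the Chi-square(r) CDF."""
    return float(chi2.ppf(1.0 - alpha, df=r))

def asymptotic_p_value(lm, r):
    """Asymptotic probability of a statistic larger than lm under the null."""
    return float(chi2.sf(lm, df=r))

z = critical_value(alpha=0.01, r=1)
print(round(z, 4))  # 6.6349
```

Rejecting when the statistic exceeds `z` is equivalent to rejecting when `asymptotic_p_value` is smaller than the chosen size.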


A simple example of how the score test can be used follows.

Example Let the parameter space be the set of all $2$-dimensional vectors, i.e., [eq56]. Denote the first and second components of the true parameter $\theta_{0}$ by $\theta_{0,1}$ and $\theta_{0,2}$. Suppose we want to test the restriction [eq57] In this case, the function [eq58] is a function [eq59] defined by [eq60] We have that $r=1$ and the Jacobian of $g$ is [eq61] whose rank is equal to $r=1$. Note also that it does not depend on $\theta$. We then maximize the log-likelihood function with respect to $\theta_{2}$ (keeping $\theta_{1}$ fixed at $\theta_{1}=0$). Suppose we obtain the following estimates of the parameter and of the asymptotic covariance matrix: [eq62] where $70$ is the sample size. Suppose also that the value of the score is [eq63] Then, the score statistic is [eq64] Under the null hypothesis, the statistic has, asymptotically, a Chi-square distribution with $r=1$ degree of freedom. Suppose we want the size of our test to be $\alpha =1\%$. Then, the critical value $z$ is [eq65] where $F\left( z\right)$ is the cumulative distribution function of a Chi-square random variable with 1 degree of freedom and [eq66] can be calculated with any statistical software (we have done it in MATLAB, using the command chi2inv(0.99,1)). Thus, the test statistic exceeds the critical value [eq67] and we reject the null hypothesis.
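The steps of the example can be sketched end to end in Python. The numbers of the example are not reused: the sketch assumes a hypothetical model in which the observations are normal, $\theta_{1}$ is the mean and $\theta_{2}$ is the variance, the data are simulated, and the statistic has the form $\frac{1}{n}s^{\top}\widehat{V}_{n}s$. The call `chi2.ppf(0.99, df=1)` is SciPy's counterpart of the MATLAB command mentioned above.

```python
import numpy as np
from scipy.stats import chi2

# Simulated data (hypothetical): 70 draws whose true mean is not zero.
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=2.0, size=70)
n = x.size

# Constrained ML estimate: theta_1 fixed at 0, theta_2 (variance) maximized.
theta2_r = np.mean(x**2)

# Score (gradient of the normal log-likelihood) at the constrained estimate.
score = np.array([
    np.sum(x) / theta2_r,                                       # d/d theta_1
    -n / (2 * theta2_r) + np.sum(x**2) / (2 * theta2_r**2),     # d/d theta_2 (zero)
])

# Estimate of the asymptotic covariance matrix: inverse of the
# per-observation Fisher information of the normal model.
V_hat = np.diag([theta2_r, 2 * theta2_r**2])

lm = float(score @ V_hat @ score) / n   # assumed form of the score statistic
z = chi2.ppf(0.99, df=1)                # critical value for a 1% size
reject = lm > z
print(lm, z, reject)
```

In this particular model the quadratic form simplifies to $n\bar{x}^{2}/\widehat{\theta}_{2}$, which gives a quick way to check the matrix computation.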
