
Information matrix

The information matrix (also called Fisher information matrix) is the matrix of second cross-moments of the score vector. The latter is the vector of first partial derivatives of the log-likelihood function with respect to its parameters.


Definition

The information matrix is defined as follows.

Definition Let $\theta$ be a $K\times 1$ parameter vector characterizing the distribution of a sample $\xi$. Let $L(\theta;\xi)$ be the likelihood function of $\xi$, depending on the parameter $\theta$. Let $l(\theta;\xi)$ be the log-likelihood function
$$l(\theta;\xi)=\ln L(\theta;\xi).$$
Denote by
$$s(\theta;\xi)=\nabla_{\theta}\,l(\theta;\xi)$$
the score vector, that is, the $K\times 1$ vector of first derivatives of $l(\theta;\xi)$ with respect to the entries of $\theta$. The information matrix $I(\theta)$ is the $K\times K$ matrix of second cross-moments of the score, defined by
$$I(\theta)=\operatorname{E}_{\theta}\!\left[s(\theta;\xi)\,s(\theta;\xi)^{\top}\right],$$
where the notation $\operatorname{E}_{\theta}$ indicates that the expected value is taken with respect to the probability distribution associated to the parameter $\theta$.

For example, if the sample $\xi$ has a continuous distribution, then the likelihood function is
$$L(\theta;\xi)=f(\xi;\theta),$$
where $f(\xi;\theta)$ is the probability density function of $\xi$, parametrized by $\theta$, and the information matrix is
$$I(\theta)=\operatorname{E}_{\theta}\!\left[\nabla_{\theta}\ln f(\xi;\theta)\,\nabla_{\theta}\ln f(\xi;\theta)^{\top}\right].$$
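As a quick numerical sanity check, here is a minimal Python sketch that approximates the information of an exponential distribution with rate $\lambda$ by averaging the squared score over simulated draws; for this model the score of a single observation is $1/\lambda-x$ and the information is known in closed form to be $1/\lambda^{2}$. The rate value, seed, and number of draws below are arbitrary choices made for the illustration.

```python
import numpy as np

# Monte Carlo approximation of I(lambda) = E[s(lambda; X)^2] for the exponential
# density f(x; lambda) = lambda * exp(-lambda * x), whose score is s = 1/lambda - x
# and whose Fisher information is 1/lambda**2.

rng = np.random.default_rng(0)
lam = 2.0                                              # arbitrary parameter value
x = rng.exponential(scale=1 / lam, size=1_000_000)     # simulated observations

score = 1 / lam - x                                    # score of each observation
info_mc = np.mean(score**2)                            # estimate of E[s^2]

print(info_mc)                                         # close to 1 / lam**2 = 0.25
```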

The information matrix is the covariance matrix of the score

Under mild regularity conditions, the expected value of the score is equal to zero:
$$\operatorname{E}_{\theta}\!\left[s(\theta;\xi)\right]=0.$$
As a consequence,
$$I(\theta)=\operatorname{E}_{\theta}\!\left[s(\theta;\xi)\,s(\theta;\xi)^{\top}\right]=\operatorname{Var}_{\theta}\!\left[s(\theta;\xi)\right],$$
that is, the information matrix is the covariance matrix of the score.
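The same exponential sketch can be used to check these two facts numerically: the simulated scores should average to approximately zero, and their sample variance should approximately coincide with the second moment $1/\lambda^{2}$.

```python
import numpy as np

# For the exponential model of the previous sketch, the score should have mean
# (approximately) zero, so its variance should match E[s^2] = 1/lambda**2.

rng = np.random.default_rng(1)
lam = 2.0
x = rng.exponential(scale=1 / lam, size=1_000_000)
score = 1 / lam - x

print(np.mean(score))     # close to 0
print(np.var(score))      # close to np.mean(score**2) = 1/lam**2 = 0.25
```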

Information equality

Under mild regularity conditions, it can be proved that
$$I(\theta)=-\operatorname{E}_{\theta}\!\left[H(\theta;\xi)\right],$$
where
$$H(\theta;\xi)=\nabla_{\theta\theta}\,l(\theta;\xi)$$
is the matrix of second-order cross-partial derivatives (so-called Hessian matrix) of the log-likelihood.

This equality is called information equality.
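To illustrate the information equality numerically, consider a Poisson distribution with rate $\lambda$: the score of one observation is $x/\lambda-1$, the second derivative of the log-density is $-x/\lambda^{2}$, and both $\operatorname{E}\!\left[s^{2}\right]$ and $-\operatorname{E}\!\left[H\right]$ equal $1/\lambda$. The minimal Python sketch below checks this by simulation with an arbitrary parameter value.

```python
import numpy as np

# Information equality check for a Poisson(lambda) model:
#   log f(x; lambda) = x*log(lambda) - lambda - log(x!)
#   score   s = x/lambda - 1
#   hessian H = -x/lambda**2
# Both E[s^2] and -E[H] should equal 1/lambda.

rng = np.random.default_rng(2)
lam = 3.0
x = rng.poisson(lam=lam, size=1_000_000)

score = x / lam - 1
hess = -x / lam**2

print(np.mean(score**2))   # close to 1/lam
print(-np.mean(hess))      # also close to 1/lam
```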

Information matrix of the normal distribution

As an example, consider a sample $\xi=\left(x_{1},\ldots,x_{n}\right)$ made up of the realizations of $n$ IID normal random variables with parameters $\mu$ and $\sigma^{2}$ (mean and variance).

In this case, the information matrix is
$$I(\mu,\sigma^{2})=\begin{bmatrix}\dfrac{n}{\sigma^{2}} & 0\\[1ex] 0 & \dfrac{n}{2\sigma^{4}}\end{bmatrix}.$$
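A minimal Python sketch can be used to check this formula by simulation: repeatedly draw normal samples of size $n$, compute the $2\times 1$ score of each sample, and average the outer products, which should approximate $\operatorname{diag}\!\left(n/\sigma^{2},\,n/(2\sigma^{4})\right)$. The parameter values, sample size, and number of replications below are arbitrary.

```python
import numpy as np

# Monte Carlo check of the information matrix of n IID N(mu, sigma^2) observations.
# The score of the whole sample has entries
#   dl/dmu      = sum(x_i - mu) / sigma^2
#   dl/dsigma^2 = -n/(2 sigma^2) + sum((x_i - mu)^2) / (2 sigma^4)
# and the information matrix is diag(n/sigma^2, n/(2 sigma^4)).

rng = np.random.default_rng(3)
mu, s2, n, reps = 1.0, 4.0, 10, 200_000        # arbitrary illustration values

x = rng.normal(loc=mu, scale=np.sqrt(s2), size=(reps, n))
d = x - mu

score = np.column_stack([
    d.sum(axis=1) / s2,
    -n / (2 * s2) + (d**2).sum(axis=1) / (2 * s2**2),
])

info_mc = score.T @ score / reps               # average of the outer products s s'
print(info_mc)                                 # close to [[n/s2, 0], [0, n/(2*s2**2)]]
```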

Proof

The log-likelihood function is
$$l(\mu,\sigma^{2};\xi)=-\frac{n}{2}\ln(2\pi)-\frac{n}{2}\ln(\sigma^{2})-\frac{1}{2\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)^{2},$$
as proved in the lecture on maximum likelihood estimation of the parameters of the normal distribution. The score $s$ is a $2\times 1$ vector whose entries are the partial derivatives of the log-likelihood with respect to $\mu$ and $\sigma^{2}$:
$$s(\mu,\sigma^{2};\xi)=\begin{bmatrix}\dfrac{\partial l}{\partial\mu}\\[2ex]\dfrac{\partial l}{\partial\sigma^{2}}\end{bmatrix}=\begin{bmatrix}\dfrac{1}{\sigma^{2}}\displaystyle\sum_{i=1}^{n}(x_{i}-\mu)\\[2ex]-\dfrac{n}{2\sigma^{2}}+\dfrac{1}{2\sigma^{4}}\displaystyle\sum_{i=1}^{n}(x_{i}-\mu)^{2}\end{bmatrix}.$$
The information matrix is
$$I(\mu,\sigma^{2})=\operatorname{E}\!\left[s\,s^{\top}\right]=\begin{bmatrix}\operatorname{E}\!\left[\left(\dfrac{\partial l}{\partial\mu}\right)^{2}\right] & \operatorname{E}\!\left[\dfrac{\partial l}{\partial\mu}\dfrac{\partial l}{\partial\sigma^{2}}\right]\\[2ex]\operatorname{E}\!\left[\dfrac{\partial l}{\partial\mu}\dfrac{\partial l}{\partial\sigma^{2}}\right] & \operatorname{E}\!\left[\left(\dfrac{\partial l}{\partial\sigma^{2}}\right)^{2}\right]\end{bmatrix}.$$
We have
$$\operatorname{E}\!\left[\left(\frac{\partial l}{\partial\mu}\right)^{2}\right]=\frac{1}{\sigma^{4}}\operatorname{E}\!\left[\left(\sum_{i=1}^{n}(x_{i}-\mu)\right)^{2}\right]=\frac{1}{\sigma^{4}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)(x_{j}-\mu)\right]\overset{A}{=}\frac{1}{\sigma^{4}}\sum_{i=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{2}\right]\overset{B}{=}\frac{n\sigma^{2}}{\sigma^{4}}=\frac{n}{\sigma^{2}},$$
where: in step $A$ we have used the fact that $\operatorname{E}\!\left[(x_{i}-\mu)(x_{j}-\mu)\right]=0$ for $i\neq j$ because the variables in the sample are independent and have mean equal to $\mu$; in step $B$ we have used the fact that $\operatorname{E}\!\left[(x_{i}-\mu)^{2}\right]=\sigma^{2}$. Moreover,
$$\operatorname{E}\!\left[\left(\frac{\partial l}{\partial\sigma^{2}}\right)^{2}\right]=\operatorname{E}\!\left[\left(-\frac{n}{2\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum_{i=1}^{n}(x_{i}-\mu)^{2}\right)^{2}\right]=\frac{n^{2}}{4\sigma^{4}}-\frac{n}{2\sigma^{6}}\sum_{i=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{2}\right]+\frac{1}{4\sigma^{8}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{2}(x_{j}-\mu)^{2}\right]\overset{A}{=}\frac{n^{2}}{4\sigma^{4}}-\frac{n}{2\sigma^{6}}\sum_{i=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{2}\right]+\frac{1}{4\sigma^{8}}\left(\sum_{i=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{4}\right]+\sum_{i\neq j}\operatorname{E}\!\left[(x_{i}-\mu)^{2}\right]\operatorname{E}\!\left[(x_{j}-\mu)^{2}\right]\right)\overset{B}{=}\frac{n^{2}}{4\sigma^{4}}-\frac{n^{2}}{2\sigma^{4}}+\frac{3n\sigma^{4}+n(n-1)\sigma^{4}}{4\sigma^{8}}=\frac{n}{2\sigma^{4}},$$
where: in steps $A$ and $B$ we have used the independence of the observations in the sample and in step $B$ we have used the fact that the fourth central moment of the normal distribution is equal to $3\sigma^{4}$. Finally,
$$\operatorname{E}\!\left[\frac{\partial l}{\partial\mu}\frac{\partial l}{\partial\sigma^{2}}\right]=\operatorname{E}\!\left[\frac{1}{\sigma^{2}}\sum_{i=1}^{n}(x_{i}-\mu)\left(-\frac{n}{2\sigma^{2}}+\frac{1}{2\sigma^{4}}\sum_{j=1}^{n}(x_{j}-\mu)^{2}\right)\right]=-\frac{n}{2\sigma^{4}}\sum_{i=1}^{n}\operatorname{E}\!\left[x_{i}-\mu\right]+\frac{1}{2\sigma^{6}}\sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)(x_{j}-\mu)^{2}\right]\overset{A}{=}\frac{1}{2\sigma^{6}}\sum_{i=1}^{n}\operatorname{E}\!\left[(x_{i}-\mu)^{3}\right]\overset{B}{=}0,$$
where: in step $A$ we have used the facts that $\operatorname{E}\!\left[x_{i}-\mu\right]=0$ and that $\operatorname{E}\!\left[(x_{i}-\mu)(x_{j}-\mu)^{2}\right]=\operatorname{E}\!\left[x_{i}-\mu\right]\operatorname{E}\!\left[(x_{j}-\mu)^{2}\right]=0$ for $i\neq j$ because the variables in the sample are independent; in step $B$ we have used the fact that the third central moment of the normal distribution is equal to zero.
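The single-observation computations in the proof can also be verified symbolically. The sketch below uses SymPy to differentiate the log-density of one $N(\mu,\sigma^{2})$ observation and then substitutes the central moments used above ($0$, $\sigma^{2}$, $0$ and $3\sigma^{4}$) for the powers of $x-\mu$; by the independence of the observations, the information matrix of the whole sample is $n$ times the resulting single-observation matrix.

```python
import sympy as sp

x, mu, u = sp.symbols('x mu u')
s2 = sp.Symbol('sigma2', positive=True)    # sigma2 stands for sigma^2

# Log-density of one N(mu, sigma^2) observation and its score with respect to (mu, sigma^2).
logf = -sp.Rational(1, 2) * sp.log(2 * sp.pi * s2) - (x - mu)**2 / (2 * s2)
score = [sp.diff(logf, mu), sp.diff(logf, s2)]

# Central moments of the normal used in the proof: E[(x - mu)^k] for k = 0, ..., 4.
central = {0: sp.Integer(1), 1: sp.Integer(0), 2: s2, 3: sp.Integer(0), 4: 3 * s2**2}

def expect(expr):
    """Expectation of a polynomial in (x - mu), obtained by substituting central moments."""
    poly = sp.Poly(sp.expand(expr.subs(x, mu + u)), u)   # rewrite as a polynomial in u = x - mu
    return sp.simplify(sum(coeff * central[k] for (k,), coeff in poly.terms()))

# Information contributed by a single observation; the whole sample contributes n times this.
info_single = sp.Matrix(2, 2, lambda i, j: expect(score[i] * score[j]))
print(info_single)   # Matrix([[1/sigma2, 0], [0, 1/(2*sigma2**2)]])
```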

More details

More details about the Fisher information matrix, including proofs of the information equality and of the fact that the expected value of the score is equal to zero, can be found in the lecture entitled Maximum likelihood.

