Set estimation of the mean

This lecture presents some examples of set estimation problems, focusing on set estimation of the mean, that is, on using a sample to produce a set estimate of the mean of an unknown distribution.

Normal IID samples - Known variance

In this example we make assumptions that are similar to those we made in the example of point estimation of the mean entitled Mean estimation - Normal IID samples. It would be beneficial to read that example before reading this one.

The sample

In this example, the sample $\xi_{n}$ is made of $n$ independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^{2}$. Specifically, we observe $n$ realizations $x_{1}$, ..., $x_{n}$ of $n$ independent random variables $X_{1}$, ..., $X_{n}$, all having a normal distribution with unknown mean $\mu$ and known variance $\sigma^{2}$. The sample is the $n$-dimensional vector $\xi_{n}=[x_{1}\ \ldots\ x_{n}]$, which is a realization of the random vector $\Xi_{n}=[X_{1}\ \ldots\ X_{n}]$.

The interval estimator

To construct an interval estimator of the mean $\mu$, we use the sample mean

$$\overline{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$$

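The interval estimator is

$$T_{n}=\left[\overline{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}},\ \overline{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right]$$

where $z$ is a strictly positive constant.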

Coverage probability

The coverage probability of the interval estimator $T_{n}$ is

$$P(\mu\in T_{n})=P(-z\leq Z\leq z)$$

where $Z$ is a standard normal random variable.

Proof

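The coverage probability can be written as

$$\begin{aligned}P(\mu\in T_{n})&=P\left(\overline{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}}\leq\mu\leq\overline{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right)\\&=P\left(-z\leq\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\leq z\right)\\&=P(-z\leq Z\leq z)\end{aligned}$$

where we have defined

$$Z=\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}$$

In the lecture entitled Point estimation of the mean, we have demonstrated that, given the assumptions on the sample $\xi_{n}$ made above, the sample mean $\overline{X}_{n}$ has a normal distribution with mean $\mu$ and variance $\sigma^{2}/n$. Subtracting the mean of a normal random variable from the random variable itself and dividing the result by the square root of its variance, one obtains a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.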

Confidence coefficient

Note that the coverage probability does not depend on the unknown parameter $\mu$. Therefore, the confidence coefficient of the interval estimator $T_{n}$ coincides with its coverage probability:

$$\inf_{\mu}P(\mu\in T_{n})=P(-z\leq Z\leq z)$$

where $Z$ is a standard normal random variable.
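The claim that the coverage probability equals $P(-z\leq Z\leq z)$ whatever the value of the mean can be checked by simulation. The sketch below is only illustrative: it assumes the interval has the form $\overline{X}_{n}\pm z\sqrt{\sigma^{2}/n}$, and the function name `simulated_coverage`, the two test values of the mean, the variance, the sample size and the constant `z = 1.96` are all hypothetical choices made for the example.

```python
import random
from statistics import NormalDist, fmean


def simulated_coverage(mu, sigma2, n, z, n_sims=10_000, seed=0):
    """Share of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    half_width = z * (sigma2 / n) ** 0.5  # variance is known, so this is fixed
    hits = 0
    for _ in range(n_sims):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_sims


z = 1.96
std_normal = NormalDist()
theoretical = std_normal.cdf(z) - std_normal.cdf(-z)  # P(-z <= Z <= z)

# Two different values of the "unknown" mean: coverage should be about equal.
coverage_a = simulated_coverage(mu=0.0, sigma2=4.0, n=25, z=z)
coverage_b = simulated_coverage(mu=50.0, sigma2=4.0, n=25, z=z, seed=1)
print(f"theoretical={theoretical:.4f}  mu=0: {coverage_a:.4f}  mu=50: {coverage_b:.4f}")
```

Up to Monte Carlo noise, both simulated frequencies should sit close to the theoretical value, which is what it means for the confidence coefficient to coincide with the coverage probability.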

Size

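The size of the interval estimator $T_{n}$ is

$$\left(\overline{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right)-\left(\overline{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}}\right)=2z\sqrt{\frac{\sigma^{2}}{n}}$$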

Expected size

Note that the size does not depend on the sample $\xi_{n}$. Therefore, the expected size is

$$\mathrm{E}\left[2z\sqrt{\frac{\sigma^{2}}{n}}\right]=2z\sqrt{\frac{\sigma^{2}}{n}}$$
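As a small numerical illustration, and assuming the size is $2z\sqrt{\sigma^{2}/n}$ (the length of an interval of the form $\overline{X}_{n}\pm z\sqrt{\sigma^{2}/n}$), the following sketch tabulates the size for a few sample sizes. The helper name `interval_size` and the numbers used are hypothetical.

```python
from math import sqrt


def interval_size(sigma2, n, z):
    """Length of the interval xbar +/- z * sqrt(sigma2 / n)."""
    return 2.0 * z * sqrt(sigma2 / n)


# The size is deterministic: it depends on n, z and sigma2, not on the data.
for n in (25, 100, 400):
    print(n, round(interval_size(sigma2=4.0, n=n, z=1.96), 4))
```

Because the size is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the size of the interval.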

Normal IID samples - Unknown variance

This example is similar to the previous one. The only difference is that we now relax the assumption that the variance of the distribution is known.

The sample

In this example, the sample $\xi_{n}$ is made of $n$ independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^{2}$. Specifically, we observe $n$ realizations $x_{1}$, ..., $x_{n}$ of $n$ independent random variables $X_{1}$, ..., $X_{n}$, all having a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^{2}$. The sample is the $n$-dimensional vector $\xi_{n}=[x_{1}\ \ldots\ x_{n}]$, which is a realization of the random vector $\Xi_{n}=[X_{1}\ \ldots\ X_{n}]$.

The interval estimator

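To construct interval estimators of the mean $\mu$, we use the sample mean

$$\overline{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$$

and either the unadjusted sample variance

$$S_{n}^{2}=\frac{1}{n}\sum_{i=1}^{n}\left(X_{i}-\overline{X}_{n}\right)^{2}$$

or the adjusted sample variance

$$s_{n}^{2}=\frac{1}{n-1}\sum_{i=1}^{n}\left(X_{i}-\overline{X}_{n}\right)^{2}$$

We consider two interval estimators of the mean:

$$T_{n}^{u}=\left[\overline{X}_{n}-z\sqrt{\frac{S_{n}^{2}}{n}},\ \overline{X}_{n}+z\sqrt{\frac{S_{n}^{2}}{n}}\right]\qquad T_{n}^{a}=\left[\overline{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}},\ \overline{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right]$$

where $z$ is a strictly positive constant and the superscripts $u$ and $a$ indicate whether the estimator is based on the unadjusted or the adjusted sample variance.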

Coverage probability

The coverage probability of the interval estimator $T_{n}^{u}$ is

$$P(\mu\in T_{n}^{u})=P\left(-\sqrt{\frac{n-1}{n}}\,z\leq Z_{n-1}\leq\sqrt{\frac{n-1}{n}}\,z\right)$$

where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

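The coverage probability can be written as

$$\begin{aligned}P(\mu\in T_{n}^{u})&=P\left(\overline{X}_{n}-z\sqrt{\frac{S_{n}^{2}}{n}}\leq\mu\leq\overline{X}_{n}+z\sqrt{\frac{S_{n}^{2}}{n}}\right)\\&=P\left(-z\leq\frac{\overline{X}_{n}-\mu}{\sqrt{S_{n}^{2}/n}}\leq z\right)\\&=P\left(-\sqrt{\frac{n-1}{n}}\,z\leq Z_{n-1}\leq\sqrt{\frac{n-1}{n}}\,z\right)\end{aligned}$$

where we have defined

$$Z_{n-1}=\sqrt{\frac{n-1}{n}}\,\frac{\overline{X}_{n}-\mu}{\sqrt{S_{n}^{2}/n}}$$

Now, rewrite $Z_{n-1}$ as

$$Z_{n-1}=\frac{\overline{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}=\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\cdot\frac{1}{\sqrt{s_{n}^{2}/\sigma^{2}}}=\frac{Y}{\sqrt{W}}$$

where we have defined

$$Y=\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\qquad W=\frac{s_{n}^{2}}{\sigma^{2}}$$

and we have used the fact that the unadjusted sample variance can be expressed as a function of the adjusted sample variance as follows:

$$S_{n}^{2}=\frac{n-1}{n}s_{n}^{2}$$

In the lecture entitled Point estimation of variance, we have demonstrated that, given the assumptions on the sample $\xi_{n}$ made above, the adjusted sample variance $s_{n}^{2}$ has a Gamma distribution with parameters $n-1$ and $\sigma^{2}$. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$. Moreover, the random variable $Y$ has a standard normal distribution (see the previous section), and it is independent of $W$ because the sample mean and the sample variance of a normal IID sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of an independent Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture entitled Student's t distribution for a proof of this fact).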

The coverage probability of the interval estimator $T_{n}^{a}$ is

$$P(\mu\in T_{n}^{a})=P(-z\leq Z_{n-1}\leq z)$$

where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

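The coverage probability can be written as

$$\begin{aligned}P(\mu\in T_{n}^{a})&=P\left(\overline{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}}\leq\mu\leq\overline{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right)\\&=P\left(-z\leq\frac{\overline{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}\leq z\right)\\&=P(-z\leq Z_{n-1}\leq z)\end{aligned}$$

where we have defined

$$Z_{n-1}=\frac{\overline{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}$$

Now, rewrite $Z_{n-1}$ as

$$Z_{n-1}=\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\cdot\frac{1}{\sqrt{s_{n}^{2}/\sigma^{2}}}=\frac{Y}{\sqrt{W}}$$

where we have defined

$$Y=\frac{\overline{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\qquad W=\frac{s_{n}^{2}}{\sigma^{2}}$$

In the lecture entitled Point estimation of variance, we have demonstrated that, given the assumptions on the sample $\xi_{n}$ made above, the adjusted sample variance $s_{n}^{2}$ has a Gamma distribution with parameters $n-1$ and $\sigma^{2}$. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$. Moreover, the random variable $Y$ has a standard normal distribution (see the previous section), and it is independent of $W$ because the sample mean and the sample variance of a normal IID sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of an independent Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture entitled Student's t distribution for a proof of this fact).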

Note that the coverage probability of the confidence interval based on the unadjusted sample variance $S_{n}^{2}$ is lower than the coverage probability of the confidence interval based on the adjusted sample variance $s_{n}^{2}$ because

$$\sqrt{\frac{n-1}{n}}\,z<z$$

and, as a consequence,

$$P\left(-\sqrt{\frac{n-1}{n}}\,z\leq Z_{n-1}\leq\sqrt{\frac{n-1}{n}}\,z\right)<P(-z\leq Z_{n-1}\leq z)$$
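The ranking of the two coverage probabilities can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions: the two intervals are taken to be $\overline{X}_{n}\pm z\sqrt{S_{n}^{2}/n}$ and $\overline{X}_{n}\pm z\sqrt{s_{n}^{2}/n}$, and the function name `simulate_coverage`, the parameter values, the small sample size `n = 5` and the constant `z = 2.0` are hypothetical choices that make the gap easy to see.

```python
import random
from statistics import fmean


def simulate_coverage(mu, sigma2, n, z, n_sims=20_000, seed=0):
    """Simulated coverage of the unadjusted and adjusted intervals."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    hits_u = hits_a = 0
    for _ in range(n_sims):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(x)
        ss = sum((xi - xbar) ** 2 for xi in x)    # sum of squared deviations
        half_u = z * ((ss / n) / n) ** 0.5        # uses S_n^2 = ss / n
        half_a = z * ((ss / (n - 1)) / n) ** 0.5  # uses s_n^2 = ss / (n - 1)
        hits_u += abs(xbar - mu) <= half_u
        hits_a += abs(xbar - mu) <= half_a
    return hits_u / n_sims, hits_a / n_sims


cov_u, cov_a = simulate_coverage(mu=0.0, sigma2=1.0, n=5, z=2.0)
print(f"unadjusted: {cov_u:.4f}  adjusted: {cov_a:.4f}")
```

In every simulated sample the unadjusted interval is contained in the adjusted one, so its simulated coverage can never exceed that of the adjusted interval.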

Confidence coefficient

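Note that the coverage probabilities of $T_{n}^{u}$ and $T_{n}^{a}$ do not depend on the unknown parameters $\mu$ and $\sigma^{2}$. Therefore, the confidence coefficients of the two confidence intervals coincide with the respective coverage probabilities:

$$\inf_{\mu,\sigma^{2}}P(\mu\in T_{n}^{u})=P\left(-\sqrt{\frac{n-1}{n}}\,z\leq Z_{n-1}\leq\sqrt{\frac{n-1}{n}}\,z\right)\qquad\inf_{\mu,\sigma^{2}}P(\mu\in T_{n}^{a})=P(-z\leq Z_{n-1}\leq z)$$

where $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom.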

Size

The size of the confidence interval $T_{n}^{u}$ is

$$2z\sqrt{\frac{S_{n}^{2}}{n}}$$

while the size of the confidence interval $T_{n}^{a}$ is

$$2z\sqrt{\frac{s_{n}^{2}}{n}}$$

Note that the size of the confidence interval based on the unadjusted sample variance $S_{n}^{2}$ is smaller than the size of the confidence interval based on the adjusted sample variance $s_{n}^{2}$ because

$$S_{n}^{2}=\frac{n-1}{n}s_{n}^{2}<s_{n}^{2}$$

and, as a consequence,

$$2z\sqrt{\frac{S_{n}^{2}}{n}}<2z\sqrt{\frac{s_{n}^{2}}{n}}$$

Thus, the confidence interval based on the unadjusted sample variance has a smaller size and a smaller coverage probability. As we have explained in the lecture entitled Set estimation, the choice of set estimators is often inspired by the principle of achieving the highest possible coverage probability for a given size or the smallest possible size for a given coverage probability. Following this principle, there is no clear ranking between the estimator based on the unadjusted sample variance and the estimator based on the adjusted sample variance, because the former has smaller size, but the latter has higher coverage probability.

Expected size

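The expected size of $T_{n}^{u}$ is

$$\mathrm{E}\left[2z\sqrt{\frac{S_{n}^{2}}{n}}\right]=\frac{2\sqrt{2}\,z}{n}\sqrt{\sigma^{2}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}$$

where $\Gamma(\cdot)$ is the Gamma function.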

Proof

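We need to use the fact that $S_{n}^{2}$ has a Gamma distribution with parameters $n-1$ and $\frac{n-1}{n}\sigma^{2}$. To simplify the notation, set

$$X=S_{n}^{2}$$

The probability density function of $X$ is

$$f_{X}(x)=c\,x^{(n-1)/2-1}\exp\left(-\frac{n}{2\sigma^{2}}x\right)\quad\text{for }x\geq 0$$

where $c$ is a constant:

$$c=\frac{(n/\sigma^{2})^{(n-1)/2}}{2^{(n-1)/2}\,\Gamma((n-1)/2)}$$

and $\Gamma(\cdot)$ is the Gamma function. Therefore,

$$\begin{aligned}\mathrm{E}\left[\sqrt{X}\right]&=\int_{0}^{\infty}x^{1/2}\,c\,x^{(n-1)/2-1}\exp\left(-\frac{n}{2\sigma^{2}}x\right)dx\\&=\frac{c}{c_{2}}\int_{0}^{\infty}c_{2}\,x^{n/2-1}\exp\left(-\frac{n}{2\sigma^{2}}x\right)dx\\&=\frac{c}{c_{2}}=\sqrt{\frac{2\sigma^{2}}{n}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\end{aligned}$$

where we have defined

$$c_{2}=\frac{(n/\sigma^{2})^{n/2}}{2^{n/2}\,\Gamma(n/2)}$$

and we have used the fact that

$$\int_{0}^{\infty}c_{2}\,x^{n/2-1}\exp\left(-\frac{n}{2\sigma^{2}}x\right)dx=1$$

because it is the integral of the density of a Gamma random variable with parameters $n$ and $\sigma^{2}$ over its support, and probability densities integrate to 1. Thus,

$$\mathrm{E}\left[2z\sqrt{\frac{S_{n}^{2}}{n}}\right]=\frac{2z}{\sqrt{n}}\,\mathrm{E}\left[\sqrt{X}\right]=\frac{2\sqrt{2}\,z}{n}\sqrt{\sigma^{2}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}$$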

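The expected size of $T_{n}^{a}$ is

$$\mathrm{E}\left[2z\sqrt{\frac{s_{n}^{2}}{n}}\right]=\frac{2\sqrt{2}\,z}{\sqrt{n(n-1)}}\sqrt{\sigma^{2}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}$$

where $\Gamma(\cdot)$ is the Gamma function.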

Proof

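Using the fact that

$$s_{n}^{2}=\frac{n}{n-1}S_{n}^{2}$$

we obtain

$$\mathrm{E}\left[2z\sqrt{\frac{s_{n}^{2}}{n}}\right]=\sqrt{\frac{n}{n-1}}\,\mathrm{E}\left[2z\sqrt{\frac{S_{n}^{2}}{n}}\right]=\frac{2\sqrt{2}\,z}{\sqrt{n(n-1)}}\sqrt{\sigma^{2}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}$$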
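The expected sizes can be cross-checked numerically. The sketch below assumes the closed forms $\frac{2\sqrt{2}z}{n}\sqrt{\sigma^{2}}\,\Gamma(n/2)/\Gamma((n-1)/2)$ for the unadjusted interval and $\sqrt{n/(n-1)}$ times that for the adjusted one, which follow from $nS_{n}^{2}/\sigma^{2}$ being Chi-square with $n-1$ degrees of freedom, and compares them with simulated averages. All function names and parameter values are hypothetical.

```python
import random
from math import exp, lgamma, sqrt
from statistics import fmean


def expected_size_unadjusted(sigma2, n, z):
    # Gamma(n/2) / Gamma((n-1)/2), computed on the log scale to avoid overflow
    ratio = exp(lgamma(n / 2) - lgamma((n - 1) / 2))
    return (2.0 * z / n) * sqrt(2.0 * sigma2) * ratio


def expected_size_adjusted(sigma2, n, z):
    # s_n^2 = n / (n - 1) * S_n^2, so sizes differ by the factor sqrt(n / (n - 1))
    return sqrt(n / (n - 1)) * expected_size_unadjusted(sigma2, n, z)


def simulated_mean_sizes(sigma2, n, z, n_sims=20_000, seed=0):
    """Average sizes of the two intervals over simulated normal samples."""
    rng = random.Random(seed)
    sigma = sqrt(sigma2)
    sizes_u, sizes_a = [], []
    for _ in range(n_sims):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = fmean(x)
        ss = sum((xi - xbar) ** 2 for xi in x)
        sizes_u.append(2.0 * z * sqrt(ss / n / n))
        sizes_a.append(2.0 * z * sqrt(ss / (n - 1) / n))
    return fmean(sizes_u), fmean(sizes_a)


exact_u = expected_size_unadjusted(sigma2=4.0, n=10, z=2.0)
exact_a = expected_size_adjusted(sigma2=4.0, n=10, z=2.0)
sim_u, sim_a = simulated_mean_sizes(sigma2=4.0, n=10, z=2.0)
print(f"unadjusted: exact={exact_u:.4f} simulated={sim_u:.4f}")
print(f"adjusted:   exact={exact_a:.4f} simulated={sim_a:.4f}")
```

Using `lgamma` rather than `gamma` is a design choice for numerical stability: the ratio of Gamma functions stays moderate even when each factor would overflow for large $n$.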

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Suppose you observe a sample of $100$ independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^{2}=1$. Denote the $100$ draws by $X_{1}$, ..., $X_{100}$. Suppose their sample mean $\overline{X}_{100}$ is equal to $1$, that is,

$$\overline{X}_{100}=\frac{1}{100}\sum_{i=1}^{100}X_{i}=1$$

Find a confidence interval for $\mu$, using a set estimator of $\mu$ having $90\%$ coverage probability.

Solution

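For a given sample size $n$, the interval estimator

$$T_{n}=\left[\overline{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}},\ \overline{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right]$$

has coverage probability

$$P(\mu\in T_{n})=P(-z\leq Z\leq z)$$

where $Z$ is a standard normal random variable and $z$ is a strictly positive constant. Thus, we need to find $z$ such that

$$P(-z\leq Z\leq z)=0.90$$

But

$$\begin{aligned}P(-z\leq Z\leq z)&=P(Z\leq z)-P(Z<-z)\\&=P(Z\leq z)-1+P(Z\geq -z)\\&=2P(Z\leq z)-1\end{aligned}$$

where the last equality stems from the fact that the standard normal distribution is symmetric around zero. Therefore $z$ must be such that

$$2P(Z\leq z)-1=0.90$$

or

$$P(Z\leq z)=0.95$$

Using normal distribution tables or a computer program to find the value of $z$ (see the lecture entitled Normal distribution - Values), we obtain

$$z\approx 1.645$$

Thus, the confidence interval for $\mu$ is

$$\left[1-1.645\sqrt{\frac{1}{100}},\ 1+1.645\sqrt{\frac{1}{100}}\right]=[0.8355,\ 1.1645]$$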
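The computation in this solution can be reproduced with the Python standard library. This is a minimal sketch, assuming the interval has the form $\overline{X}_{n}\pm z\sqrt{\sigma^{2}/n}$ and that $z$ solves $P(Z\leq z)=0.95$; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sigma2, xbar, coverage = 100, 1.0, 1.0, 0.90

# P(-z <= Z <= z) = coverage  is equivalent to  P(Z <= z) = (1 + coverage) / 2
z = NormalDist().inv_cdf((1 + coverage) / 2)

half_width = z * sqrt(sigma2 / n)
lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.4f}, interval = [{lower:.4f}, {upper:.4f}]")
```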

Exercise 2

Suppose you observe a sample of $100$ independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^{2}$. Denote the $100$ draws by $X_{1}$, ..., $X_{100}$. Suppose their sample mean $\overline{X}_{100}$ is equal to $1$, that is,

$$\overline{X}_{100}=\frac{1}{100}\sum_{i=1}^{100}X_{i}=1$$

and their adjusted sample variance $s_{100}^{2}$ is equal to $4$, that is,

$$s_{100}^{2}=\frac{1}{99}\sum_{i=1}^{100}\left(X_{i}-\overline{X}_{100}\right)^{2}=4$$

Find a confidence interval for $\mu$, using a set estimator of $\mu$ having $99\%$ coverage probability.

Solution

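For a given sample size $n$, the interval estimator

$$T_{n}^{a}=\left[\overline{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}},\ \overline{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right]$$

has coverage probability

$$P(\mu\in T_{n}^{a})=P(-z\leq Z_{n-1}\leq z)$$

where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom and $z$ is a strictly positive constant. Thus, we need to find $z$ such that

$$P(-z\leq Z_{n-1}\leq z)=0.99$$

But

$$\begin{aligned}P(-z\leq Z_{n-1}\leq z)&=P(Z_{n-1}\leq z)-P(Z_{n-1}<-z)\\&=P(Z_{n-1}\leq z)-1+P(Z_{n-1}\geq -z)\\&=2P(Z_{n-1}\leq z)-1\end{aligned}$$

where the last equality stems from the fact that the standard Student's t distribution is symmetric around zero. Therefore $z$ must be such that

$$2P(Z_{n-1}\leq z)-1=0.99$$

or

$$P(Z_{n-1}\leq z)=0.995$$

Using a computer program to find the value of $z$ (for example, with the MATLAB command tinv(0.995,99)), we obtain

$$z\approx 2.6264$$

Thus, the confidence interval for $\mu$ is

$$\left[1-2.6264\sqrt{\frac{4}{100}},\ 1+2.6264\sqrt{\frac{4}{100}}\right]=[0.4747,\ 1.5253]$$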
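For readers without MATLAB, the quantile can also be approximated with the Python standard library alone. The sketch below is a stand-in for `tinv(0.995,99)`, not a reference implementation: it integrates the Student's t density with Simpson's rule and inverts the distribution function by bisection. The helper names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, and the interval is assumed to have the form $\overline{X}_{n}\pm z\sqrt{s_{n}^{2}/n}$.

```python
from math import exp, lgamma, log, log1p, pi, sqrt


def t_pdf(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Distribution function via Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df, lo=0.0, hi=50.0, tol=1e-10):
    """Bisection for the p-quantile; assumes 0.5 <= p < t_cdf(hi, df)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n, s2, xbar, coverage = 100, 4.0, 1.0, 0.99
z = t_quantile((1 + coverage) / 2, df=n - 1)
half_width = z * sqrt(s2 / n)
lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.4f}, interval = [{lower:.4f}, {upper:.4f}]")
```

If SciPy is available, a library quantile function would normally be preferable to this hand-rolled inversion; the version above only avoids the extra dependency.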
