This lecture presents some examples of set
estimation problems, focusing on **set estimation of the
mean**, that is, on using a sample to produce a set estimate of the
mean of an unknown distribution.

In this example we make assumptions that are similar to those we made in the example of point estimation of the mean entitled Mean estimation - Normal IID samples. It would be beneficial to read that example before reading this one.

In this example, the sample is made of independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi = [x_1 \ \ldots \ x_n]$, which is a realization of the random vector $\Xi = [X_1 \ \ldots \ X_n]$.

To construct an interval estimator of the mean $\mu$, we use the sample mean
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

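The interval estimator is
$$T_n = \left[\bar{X}_n - \sqrt{\frac{\sigma^2}{n}}\,z,\ \ \bar{X}_n + \sqrt{\frac{\sigma^2}{n}}\,z\right]$$
where $z$ is a strictly positive constant.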

The coverage probability of the interval estimator $T_n$ is
$$C(T_n;\mu) = P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable.

Proof

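The coverage probability can be written as
$$\begin{aligned}
C(T_n;\mu) &= P(\mu \in T_n) \\
&= P\left(\bar{X}_n - \sqrt{\frac{\sigma^2}{n}}\,z \le \mu \le \bar{X}_n + \sqrt{\frac{\sigma^2}{n}}\,z\right) \\
&= P\left(-\sqrt{\frac{\sigma^2}{n}}\,z \le \bar{X}_n - \mu \le \sqrt{\frac{\sigma^2}{n}}\,z\right) \\
&= P\left(-z \le \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}} \le z\right) \\
&= P(-z \le Z \le z)
\end{aligned}$$
where we have defined
$$Z = \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}}$$
In the lecture entitled Point estimation of the mean, we have demonstrated that, given the assumptions on the sample made above, the sample mean $\bar{X}_n$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$. Subtracting the mean of a normal random variable from the random variable itself and dividing it by the square root of its variance, one obtains a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.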
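The result of the proof can be checked numerically. The following is a minimal simulation sketch, not part of the original lecture: the function names and the parameter values used in the usage lines are illustrative assumptions. It draws many normal samples, builds the interval around each sample mean, and compares the fraction of intervals containing the true mean with the probability that a standard normal variable falls between $-z$ and $z$.

```python
import random
from statistics import NormalDist, fmean


def simulated_coverage(mu, sigma2, n, z, reps=5000, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    sigma = sigma2 ** 0.5
    half_width = (sigma2 / n) ** 0.5 * z
    hits = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


def exact_coverage(z):
    """P(-z <= Z <= z) for a standard normal Z."""
    phi = NormalDist().cdf
    return phi(z) - phi(-z)


if __name__ == "__main__":
    # Hypothetical values; the simulated frequency should be close to the
    # exact probability whatever value of mu is used.
    for mu in (-3.0, 0.0, 10.0):
        print(mu, simulated_coverage(mu, sigma2=4.0, n=20, z=1.96), exact_coverage(1.96))
```

The simulated frequencies fluctuate around the exact value because of Monte Carlo error, but they should show no systematic dependence on the mean.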

Note that the coverage probability does not depend on the unknown parameter $\mu$. Therefore, the confidence coefficient of the interval estimator coincides with its coverage probability:
$$c(T_n) = \inf_{\mu \in \mathbb{R}} C(T_n;\mu) = P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable.

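The size of the interval estimator is
$$\lambda(T_n) = \left(\bar{X}_n + \sqrt{\frac{\sigma^2}{n}}\,z\right) - \left(\bar{X}_n - \sqrt{\frac{\sigma^2}{n}}\,z\right) = 2\sqrt{\frac{\sigma^2}{n}}\,z$$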

Note that the size does not depend on the sample $\xi$. Therefore, the expected size is
$$\mathrm{E}[\lambda(T_n)] = 2\sqrt{\frac{\sigma^2}{n}}\,z$$
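The interval and its size are easy to compute. The sketch below is only an illustration under the assumptions of this example (known variance); the function names and the data in the usage note are hypothetical.

```python
from statistics import fmean


def known_variance_interval(sample, sigma2, z):
    """Interval [xbar - sqrt(sigma2/n) z, xbar + sqrt(sigma2/n) z]."""
    n = len(sample)
    xbar = fmean(sample)
    half_width = (sigma2 / n) ** 0.5 * z
    return xbar - half_width, xbar + half_width


def interval_size(sigma2, n, z):
    """Size 2 sqrt(sigma2/n) z; it depends on n, sigma2 and z, not on the data."""
    return 2 * (sigma2 / n) ** 0.5 * z
```

For instance, `known_variance_interval([1.0, 2.0, 3.0, 4.0], sigma2=4.0, z=2.0)` is centered at the sample mean 2.5 and has half-width $\sqrt{4/4}\cdot 2 = 2$, so its size equals `interval_size(4.0, 4, 2.0)`.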

This example is similar to the previous one. The only difference is that we now relax the assumption that the variance of the distribution is known.

In this example, the sample is made of independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi = [x_1 \ \ldots \ x_n]$, which is a realization of the random vector $\Xi = [X_1 \ \ldots \ X_n]$.

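To construct interval estimators of the mean $\mu$, we use the sample mean
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$
and either the unadjusted sample variance
$$S_n^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i-\bar{X}_n\right)^2$$
or the adjusted sample variance
$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i-\bar{X}_n\right)^2$$

We consider two interval estimators of the mean:
$$T_n^u = \left[\bar{X}_n - \sqrt{\frac{S_n^2}{n}}\,z,\ \ \bar{X}_n + \sqrt{\frac{S_n^2}{n}}\,z\right]$$
$$T_n^a = \left[\bar{X}_n - \sqrt{\frac{s_n^2}{n}}\,z,\ \ \bar{X}_n + \sqrt{\frac{s_n^2}{n}}\,z\right]$$
where $z$ is a strictly positive constant and the superscripts $u$ and $a$ indicate whether the estimator is based on the unadjusted or the adjusted sample variance.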
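As an illustration, the two interval estimators can be computed from a sample as follows. This is a minimal sketch using Python's standard `statistics` module; the function name is a hypothetical choice, not something defined in the lecture.

```python
from statistics import fmean, pvariance, variance


def unknown_variance_intervals(sample, z):
    """Return (interval based on the unadjusted variance, interval based on the adjusted variance)."""
    n = len(sample)
    xbar = fmean(sample)
    S2 = pvariance(sample, mu=xbar)   # unadjusted sample variance: divides by n
    s2 = variance(sample, xbar=xbar)  # adjusted sample variance: divides by n - 1
    half_u = (S2 / n) ** 0.5 * z
    half_a = (s2 / n) ** 0.5 * z
    return (xbar - half_u, xbar + half_u), (xbar - half_a, xbar + half_a)
```

Both intervals are centered at the sample mean; the one based on the unadjusted sample variance is always nested inside the other, because the unadjusted variance divides the same sum of squares by $n$ rather than $n-1$.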

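The coverage probability of the interval estimator $T_n^u$ is
$$C(T_n^u;\mu,\sigma^2) = P\left(-\sqrt{\frac{n-1}{n}}\,z \le Z_{n-1} \le \sqrt{\frac{n-1}{n}}\,z\right)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.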

Proof

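The coverage probability can be written as
$$\begin{aligned}
C(T_n^u;\mu,\sigma^2) &= P(\mu \in T_n^u) \\
&= P\left(\bar{X}_n - \sqrt{\frac{S_n^2}{n}}\,z \le \mu \le \bar{X}_n + \sqrt{\frac{S_n^2}{n}}\,z\right) \\
&= P\left(-z \le \frac{\bar{X}_n-\mu}{\sqrt{S_n^2/n}} \le z\right) \\
&= P\left(-\sqrt{\frac{n-1}{n}}\,z \le Z_{n-1} \le \sqrt{\frac{n-1}{n}}\,z\right)
\end{aligned}$$
where we have defined
$$Z_{n-1} = \sqrt{\frac{n-1}{n}}\,\frac{\bar{X}_n-\mu}{\sqrt{S_n^2/n}}$$
Now, rewrite $Z_{n-1}$ as
$$Z_{n-1} = \frac{\bar{X}_n-\mu}{\sqrt{s_n^2/n}} = \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}}\cdot\frac{1}{\sqrt{s_n^2/\sigma^2}} = \frac{Z}{\sqrt{W}}$$
where we have defined
$$Z = \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}}, \qquad W = \frac{s_n^2}{\sigma^2}$$
and we have used the fact that the unadjusted sample variance can be expressed as a function of the adjusted sample variance as follows:
$$S_n^2 = \frac{n-1}{n}\,s_n^2$$
In the lecture entitled Point estimation of variance, we have demonstrated that, given the assumptions on the sample made above, the adjusted sample variance $s_n^2$ has a Gamma distribution with parameters $n-1$ and $\sigma^2$. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$. Moreover, the random variable $Z$ has a standard normal distribution (see the previous section) and is independent of $W$, because the sample mean and the sample variance of a normal IID sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of a Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture entitled Student's t distribution for a proof of this fact).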

The coverage probability of the interval estimator $T_n^a$ is
$$C(T_n^a;\mu,\sigma^2) = P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

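The coverage probability can be written as
$$\begin{aligned}
C(T_n^a;\mu,\sigma^2) &= P(\mu \in T_n^a) \\
&= P\left(\bar{X}_n - \sqrt{\frac{s_n^2}{n}}\,z \le \mu \le \bar{X}_n + \sqrt{\frac{s_n^2}{n}}\,z\right) \\
&= P\left(-z \le \frac{\bar{X}_n-\mu}{\sqrt{s_n^2/n}} \le z\right) \\
&= P(-z \le Z_{n-1} \le z)
\end{aligned}$$
where we have defined
$$Z_{n-1} = \frac{\bar{X}_n-\mu}{\sqrt{s_n^2/n}}$$
Now, rewrite $Z_{n-1}$ as
$$Z_{n-1} = \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}}\cdot\frac{1}{\sqrt{s_n^2/\sigma^2}} = \frac{Z}{\sqrt{W}}$$
where we have defined:
$$Z = \frac{\bar{X}_n-\mu}{\sqrt{\sigma^2/n}}, \qquad W = \frac{s_n^2}{\sigma^2}$$
In the lecture entitled Point estimation of variance, we have demonstrated that, given the assumptions on the sample made above, the adjusted sample variance $s_n^2$ has a Gamma distribution with parameters $n-1$ and $\sigma^2$. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$. Moreover, the random variable $Z$ has a standard normal distribution (see the previous section) and is independent of $W$, because the sample mean and the sample variance of a normal IID sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of a Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture entitled Student's t distribution for a proof of this fact).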

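Note that the coverage probability of the confidence interval based on the unadjusted sample variance is lower than the coverage probability of the confidence interval based on the adjusted sample variance because
$$\sqrt{\frac{n-1}{n}}\,z < z$$
and, as a consequence,
$$C(T_n^u;\mu,\sigma^2) = P\left(-\sqrt{\frac{n-1}{n}}\,z \le Z_{n-1} \le \sqrt{\frac{n-1}{n}}\,z\right) < P(-z \le Z_{n-1} \le z) = C(T_n^a;\mu,\sigma^2)$$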

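Note that the coverage probabilities of $T_n^u$ and $T_n^a$ do not depend on the unknown parameters $\mu$ and $\sigma^2$. Therefore, the confidence coefficients of the two confidence intervals coincide with the respective coverage probabilities:
$$c(T_n^u) = P\left(-\sqrt{\frac{n-1}{n}}\,z \le Z_{n-1} \le \sqrt{\frac{n-1}{n}}\,z\right)$$
$$c(T_n^a) = P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom.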
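The two confidence coefficients can be evaluated numerically. Python's standard library has no Student's t distribution function, so the sketch below makes an assumption of convenience: it integrates the t density with the composite Simpson rule instead of calling a statistics package. All function names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_central_prob(a, df, steps=2000):
    """P(-a <= T <= a): Simpson's rule on [0, a], doubled by symmetry (steps must be even)."""
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 2 * total * h / 3


def coverage_unadjusted(z, n):
    """Coverage of the interval based on the unadjusted sample variance."""
    return t_central_prob(math.sqrt((n - 1) / n) * z, n - 1)


def coverage_adjusted(z, n):
    """Coverage of the interval based on the adjusted sample variance."""
    return t_central_prob(z, n - 1)
```

For any sample size $n \ge 2$ and any $z > 0$, `coverage_unadjusted(z, n)` is smaller than `coverage_adjusted(z, n)`, in line with the inequality derived above; the gap shrinks as $n$ grows because $\sqrt{(n-1)/n}$ approaches $1$.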

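The size of the confidence interval $T_n^u$ is
$$\lambda(T_n^u) = 2\sqrt{\frac{S_n^2}{n}}\,z$$
while the size of the confidence interval $T_n^a$ is
$$\lambda(T_n^a) = 2\sqrt{\frac{s_n^2}{n}}\,z$$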

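Note that the size of the confidence interval based on the unadjusted sample variance is smaller than the size of the confidence interval based on the adjusted sample variance because
$$S_n^2 = \frac{n-1}{n}\,s_n^2 < s_n^2$$
and, as a consequence,
$$\lambda(T_n^u) = 2\sqrt{\frac{S_n^2}{n}}\,z < 2\sqrt{\frac{s_n^2}{n}}\,z = \lambda(T_n^a)$$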

Thus, the confidence interval based on the unadjusted sample variance has a smaller size and a smaller coverage probability. As we have explained in the lecture entitled Set estimation, the choice of set estimators is often inspired by the principle of achieving the highest possible coverage probability for a given size or the smallest possible size for a given coverage probability. Following this principle, there is no clear ranking between the estimator based on the unadjusted sample variance and the estimator based on the adjusted sample variance, because the former has smaller size, but the latter has higher coverage probability.

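The expected size of $T_n^u$ is
$$\mathrm{E}[\lambda(T_n^u)] = \frac{2\sqrt{2}}{n}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\sigma z$$
where $\Gamma$ is the Gamma function.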

Proof

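We need to use the fact that $\frac{n}{\sigma^2}S_n^2$ has a Gamma distribution with parameters $n-1$ and $n-1$. To simplify the notation, set
$$X = \frac{n}{\sigma^2}\,S_n^2$$
The probability density function of $X$ is
$$f_X(x) = c\,x^{(n-1)/2-1}\exp\left(-\frac{1}{2}x\right), \qquad x > 0$$
where $c$ is a constant:
$$c = \frac{1}{2^{(n-1)/2}\,\Gamma((n-1)/2)}$$
and $\Gamma$ is the Gamma function. Therefore,
$$\begin{aligned}
\mathrm{E}[\lambda(T_n^u)] &= \mathrm{E}\left[2\sqrt{\frac{S_n^2}{n}}\,z\right] = \frac{2\sigma z}{n}\,\mathrm{E}\left[\sqrt{X}\right] \\
&= \frac{2\sigma z}{n}\int_0^\infty x^{1/2}\,c\,x^{(n-1)/2-1}\exp\left(-\frac{1}{2}x\right)dx \\
&= \frac{2\sigma z}{n}\,\frac{c}{c_2}\int_0^\infty c_2\,x^{n/2-1}\exp\left(-\frac{1}{2}x\right)dx \\
&= \frac{2\sigma z}{n}\,\frac{c}{c_2}
\end{aligned}$$
where we have defined
$$c_2 = \frac{1}{2^{n/2}\,\Gamma(n/2)}$$
and we have used the fact that
$$\int_0^\infty c_2\,x^{n/2-1}\exp\left(-\frac{1}{2}x\right)dx = 1$$
because it is the integral of the density of a Gamma random variable with parameters $n$ and $n$ over its support and probability densities integrate to $1$. Thus,
$$\mathrm{E}[\lambda(T_n^u)] = \frac{2\sigma z}{n}\,\frac{2^{n/2}\,\Gamma(n/2)}{2^{(n-1)/2}\,\Gamma((n-1)/2)} = \frac{2\sqrt{2}}{n}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\sigma z$$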

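The expected size of $T_n^a$ is
$$\mathrm{E}[\lambda(T_n^a)] = \frac{2\sqrt{2}}{\sqrt{n(n-1)}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\sigma z$$
where $\Gamma$ is the Gamma function.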

Proof

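Using the fact that
$$s_n^2 = \frac{n}{n-1}\,S_n^2$$
we obtain
$$\mathrm{E}[\lambda(T_n^a)] = \mathrm{E}\left[2\sqrt{\frac{s_n^2}{n}}\,z\right] = \sqrt{\frac{n}{n-1}}\,\mathrm{E}\left[2\sqrt{\frac{S_n^2}{n}}\,z\right] = \sqrt{\frac{n}{n-1}}\cdot\frac{2\sqrt{2}}{n}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\sigma z = \frac{2\sqrt{2}}{\sqrt{n(n-1)}}\,\frac{\Gamma(n/2)}{\Gamma((n-1)/2)}\,\sigma z$$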
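The expected sizes can be evaluated for concrete values of $n$, $\sigma$ and $z$. The sketch below is an illustration with hypothetical function names; it works with the logarithm of the Gamma function (`math.lgamma`) so that the ratio of Gamma functions does not overflow for large $n$.

```python
import math


def expected_size_unadjusted(sigma, n, z):
    """Expected size of the interval based on the unadjusted sample variance."""
    gamma_ratio = math.exp(math.lgamma(n / 2) - math.lgamma((n - 1) / 2))
    return 2 * math.sqrt(2) / n * gamma_ratio * sigma * z


def expected_size_adjusted(sigma, n, z):
    """Expected size of the interval based on the adjusted sample variance."""
    return math.sqrt(n / (n - 1)) * expected_size_unadjusted(sigma, n, z)


def size_known_variance(sigma, n, z):
    """Size of the known-variance interval, for comparison."""
    return 2 * sigma * z / math.sqrt(n)
```

With $n = 2$ and $\sigma = z = 1$, the Gamma ratio is $\Gamma(1)/\Gamma(1/2) = 1/\sqrt{\pi}$, so the two expected sizes are $\sqrt{2/\pi}$ and $2/\sqrt{\pi}$. For a given $z$, both are smaller than the size $2\sigma z/\sqrt{n}$ of the known-variance interval, but the coverage probabilities are smaller too, so this is not a like-for-like comparison.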

Below you can find some exercises with explained solutions.

Suppose you observe a sample of $n$ independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^2$. Denote the draws by $x_1$, ..., $x_n$ and their sample mean by $\bar{x}_n$, that is,
$$\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Find a confidence interval for $\mu$, using a set estimator of $\mu$ having coverage probability $1-\alpha$.

Solution

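For a given sample size $n$, the interval estimator
$$T_n = \left[\bar{X}_n - \sqrt{\frac{\sigma^2}{n}}\,z,\ \ \bar{X}_n + \sqrt{\frac{\sigma^2}{n}}\,z\right]$$
has coverage probability
$$C(T_n;\mu) = P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable and $z$ is a strictly positive constant. Thus, denoting the required coverage probability by $1-\alpha$, we need to find $z$ such that
$$P(-z \le Z \le z) = 1-\alpha$$
But
$$P(-z \le Z \le z) = 1 - P(Z < -z) - P(Z > z) = 1 - 2P(Z > z)$$
where the last equality stems from the fact that the standard normal distribution is symmetric around zero. Therefore $z$ must be such that
$$1 - 2P(Z > z) = 1-\alpha$$
or
$$P(Z \le z) = 1 - \frac{\alpha}{2}$$
Using normal distribution tables or a computer program to find the value of $z$ (see the lecture entitled Normal distribution - Values), we obtain the $(1-\alpha/2)$-quantile of the standard normal distribution, which we denote by $z_{1-\alpha/2}$. Thus, the confidence interval for $\mu$ is
$$\left[\bar{x}_n - \sqrt{\frac{\sigma^2}{n}}\,z_{1-\alpha/2},\ \ \bar{x}_n + \sqrt{\frac{\sigma^2}{n}}\,z_{1-\alpha/2}\right]$$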
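The steps of the solution can be sketched in Python with the standard `statistics.NormalDist` class, whose `inv_cdf` method plays the role of the normal distribution tables. The function names are hypothetical, and the numbers in the usage note are illustrative assumptions, not the data of the exercise.

```python
from statistics import NormalDist


def z_for_coverage(coverage):
    """Solve P(Z <= z) = (1 + coverage) / 2 for a standard normal Z."""
    return NormalDist().inv_cdf((1 + coverage) / 2)


def known_variance_ci(xbar, sigma2, n, coverage):
    """Confidence interval for the mean when the variance sigma2 is known."""
    z = z_for_coverage(coverage)
    half_width = (sigma2 / n) ** 0.5 * z
    return xbar - half_width, xbar + half_width
```

For example, with the made-up values `xbar=10.0`, `sigma2=4.0`, `n=25` and `coverage=0.95`, the constant $z$ is approximately $1.96$ and the half-width is approximately $0.4 \cdot 1.96 = 0.784$.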

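Suppose you observe a sample of $100$ independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^2$. Denote the draws by $x_1$, ..., $x_{100}$, their sample mean by $\bar{x}_{100}$, i.e.:
$$\bar{x}_{100} = \frac{1}{100}\sum_{i=1}^{100} x_i$$
and their adjusted sample variance by $s_{100}^2$, that is,
$$s_{100}^2 = \frac{1}{99}\sum_{i=1}^{100}\left(x_i-\bar{x}_{100}\right)^2$$

Find a confidence interval for $\mu$, using a set estimator of $\mu$ having $99\%$ coverage probability.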

Solution

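For a given sample size $n$, the interval estimator
$$T_n^a = \left[\bar{X}_n - \sqrt{\frac{s_n^2}{n}}\,z,\ \ \bar{X}_n + \sqrt{\frac{s_n^2}{n}}\,z\right]$$
has coverage probability
$$C(T_n^a;\mu,\sigma^2) = P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom and $z$ is a strictly positive constant. Thus, with $n = 100$, we need to find $z$ such that
$$P(-z \le Z_{99} \le z) = 0.99$$
But
$$P(-z \le Z_{99} \le z) = 1 - P(Z_{99} < -z) - P(Z_{99} > z) = 1 - 2P(Z_{99} > z)$$
where the last equality stems from the fact that the standard Student's t distribution is symmetric around zero. Therefore $z$ must be such that
$$P(Z_{99} > z) = 0.005$$
or:
$$P(Z_{99} \le z) = 0.995$$
Using a computer program to find the value of $z$ (for example, with the MATLAB command `tinv(0.995,99)`), we obtain
$$z \approx 2.626$$
Thus, the confidence interval for $\mu$ is
$$\left[\bar{x}_{100} - \sqrt{\frac{s_{100}^2}{100}}\cdot 2.626,\ \ \bar{x}_{100} + \sqrt{\frac{s_{100}^2}{100}}\cdot 2.626\right]$$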
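For readers without MATLAB, the quantile returned by `tinv(0.995,99)` can be approximated with the Python standard library alone. The sketch below is an assumption-laden illustration rather than a reference implementation: it integrates the t density with Simpson's rule and inverts the resulting distribution function by bisection. All function names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0: 0.5 plus a Simpson integral of the density on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on t_cdf."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # enlarge the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def adjusted_variance_ci(xbar, s2, n, coverage):
    """Confidence interval for the mean based on the adjusted sample variance s2."""
    z = t_quantile((1 + coverage) / 2, n - 1)
    half_width = math.sqrt(s2 / n) * z
    return xbar - half_width, xbar + half_width
```

Calling `t_quantile(0.995, 99)` plays the role of the MATLAB command, and `adjusted_variance_ci(xbar, s2, 100, 0.99)` then returns the interval for whatever sample mean `xbar` and adjusted sample variance `s2` were observed.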
