Confidence interval for the mean

by Marco Taboga, PhD

This lecture shows how to derive confidence intervals for the mean of a normal distribution.

We tackle two different cases:

  1. when the variance of the distribution is known;

  2. when the variance is unknown.

In each case we derive the level of confidence and we discuss how it is set.

We conclude with two solved exercises.

The theory needed to fully understand the derivations can be found in the lecture on interval estimation.

Known variance

We start from the simpler case in which the variance is known.

The sample

We observe the realizations of $n$ independent random variables $X_{1},\ldots,X_{n}$, all having a normal distribution with unknown mean $\mu$ and known variance $\sigma^{2}$.

The interval

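To construct a confidence interval for the mean $\mu$, we use the sample mean
$$\bar{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$$

The confidence interval is
$$T_{n}=\left[\bar{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}},\ \bar{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right]$$
where $z$ is a strictly positive constant.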

We explain below how $z$ is chosen.
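As a minimal sketch of the formula above, the following Python snippet computes the interval from a list of observations. The function name `mean_ci_known_variance`, the data, and the values chosen for `sigma2` and `z` are illustrative assumptions, not part of the lecture.

```python
import math
from statistics import fmean


def mean_ci_known_variance(sample, sigma2, z):
    """Return (lower, upper) = sample mean -/+ z * sqrt(sigma2 / n)."""
    n = len(sample)
    xbar = fmean(sample)                    # sample mean
    half_width = z * math.sqrt(sigma2 / n)  # z times the std. deviation of the sample mean
    return xbar - half_width, xbar + half_width


# Hypothetical data and parameter values, for illustration only.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4]
lower, upper = mean_ci_known_variance(sample, sigma2=0.25, z=1.96)
print(f"[{lower:.4f}, {upper:.4f}]")
```

The interval is always centered at the sample mean; only its half-width depends on $z$, on the known variance and on the sample size.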

Coverage probability

The coverage probability is the probability that the confidence interval will include the true mean $\mu$.

The coverage probability of $T_{n}$ is
$$C(T_{n};\mu)=P(-z\leq Z\leq z)$$
where $Z$ is a standard normal random variable.

Proof

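The coverage probability can be written as
$$\begin{aligned}
C(T_{n};\mu) &= P(\mu\in T_{n}) \\
&= P\left(\bar{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}}\leq\mu\leq\bar{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right) \\
&= P\left(-z\sqrt{\frac{\sigma^{2}}{n}}\leq\bar{X}_{n}-\mu\leq z\sqrt{\frac{\sigma^{2}}{n}}\right) \\
&= P\left(-z\leq\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\leq z\right) \\
&= P(-z\leq Z\leq z)
\end{aligned}$$
where we have defined
$$Z=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}$$
Given the assumptions made above, the sample mean $\bar{X}_{n}$ has a normal distribution with mean $\mu$ and variance $\sigma^{2}/n$, as demonstrated in the lecture on Point estimation of the mean. If we de-mean a normal random variable and divide it by the square root of its variance, we obtain a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.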
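The result can also be checked empirically. The sketch below is a Monte Carlo simulation under assumed values of $\mu$, $\sigma^{2}$, $n$ and $z$ (all hypothetical): it draws many samples, counts how often the interval contains the true mean, and compares the frequency with the standard normal probability of the interval $[-z,z]$.

```python
import math
import random
from statistics import NormalDist, fmean


def simulated_coverage(mu, sigma2, n, z, n_reps=20000, seed=0):
    """Fraction of simulated samples whose known-variance interval contains mu."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    half_width = z * math.sqrt(sigma2 / n)
    hits = 0
    for _ in range(n_reps):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_reps


# Hypothetical parameter values, chosen only for the illustration.
z = 1.5
std_normal = NormalDist()
theoretical = std_normal.cdf(z) - std_normal.cdf(-z)   # P(-z <= Z <= z)
estimate = simulated_coverage(mu=3.0, sigma2=4.0, n=10, z=z)
print(f"theoretical: {theoretical:.4f}, simulated: {estimate:.4f}")
```

The two numbers should agree up to simulation noise, whichever values of the mean and variance are plugged in.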

Level of confidence

The coverage probability does not depend on the unknown parameter $\mu$.

Therefore, the level of confidence coincides with the coverage probability:
$$c(T_{n})=P(-z\leq Z\leq z)$$
where $Z$ is a standard normal random variable.

How to adjust the level of confidence

The level of confidence is chosen by the statistician, who adjusts the constant $z$ accordingly.

If the level of confidence is set equal to $c$, then
$$z=F^{-1}\left(\frac{1+c}{2}\right)$$
where $F$ is the cumulative distribution function of a standard normal random variable.

Proof

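The level of confidence can be written as
$$c=P(-z\leq Z\leq z)=F(z)-F(-z)=F(z)-\left[1-F(z)\right]=2F(z)-1$$
where we have used the fact that
$$F(-z)=1-F(z)$$
by the symmetry of the standard normal distribution around 0. Therefore,
$$z=F^{-1}\left(\frac{1+c}{2}\right)$$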
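In practice the inverse of the standard normal distribution function is evaluated numerically. The following sketch uses `NormalDist.inv_cdf` from Python's standard library; the helper name `z_for_confidence_known_variance` and the list of confidence levels are assumptions made for the example.

```python
from statistics import NormalDist


def z_for_confidence_known_variance(c):
    """Return z such that P(-z <= Z <= z) = c for a standard normal Z."""
    if not 0.0 < c < 1.0:
        raise ValueError("the level of confidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf((1.0 + c) / 2.0)


for c in (0.90, 0.95, 0.99):
    z = z_for_confidence_known_variance(c)
    # Round trip: 2 F(z) - 1 should give back the requested level c.
    recovered = 2.0 * NormalDist().cdf(z) - 1.0
    print(f"c = {c:.2f}  z = {z:.4f}  2F(z)-1 = {recovered:.4f}")
```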

Unknown variance

We now relax the assumption that the variance of the distribution is known.

The sample

We observe the realizations of $n$ independent random variables $X_{1},\ldots,X_{n}$, all having a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^{2}$.

The interval

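To construct a confidence interval for the mean $\mu$, we use the sample mean
$$\bar{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$$
and the adjusted sample variance
$$s_{n}^{2}=\frac{1}{n-1}\sum_{i=1}^{n}\left(X_{i}-\bar{X}_{n}\right)^{2}$$

The confidence interval for the mean is
$$T_{n}=\left[\bar{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}},\ \bar{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right]$$
where $z$ is a strictly positive constant.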
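The computation differs from the known-variance case only in that the variance is estimated from the data. A minimal sketch follows; the function name `mean_ci_unknown_variance`, the data and the value of `z` are hypothetical, and `z` is simply taken as given here.

```python
import math
from statistics import fmean


def mean_ci_unknown_variance(sample, z):
    """Return (lower, upper) = sample mean -/+ z * sqrt(s2 / n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    xbar = fmean(sample)
    # Adjusted sample variance: squared deviations divided by n - 1.
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    half_width = z * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


# Hypothetical data; z = 2.0 is an arbitrary positive constant.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4]
lower, upper = mean_ci_unknown_variance(sample, z=2.0)
print(f"[{lower:.4f}, {upper:.4f}]")
```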

Coverage probability

The coverage probability of the confidence interval is
$$C(T_{n};\mu,\sigma^{2})=P(-z\leq Z_{n-1}\leq z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

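The coverage probability can be written as
$$\begin{aligned}
C(T_{n};\mu,\sigma^{2}) &= P(\mu\in T_{n}) \\
&= P\left(\bar{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}}\leq\mu\leq\bar{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right) \\
&= P\left(-z\sqrt{\frac{s_{n}^{2}}{n}}\leq\bar{X}_{n}-\mu\leq z\sqrt{\frac{s_{n}^{2}}{n}}\right) \\
&= P\left(-z\leq\frac{\bar{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}\leq z\right) \\
&= P(-z\leq Z_{n-1}\leq z)
\end{aligned}$$
where we have defined
$$Z_{n-1}=\frac{\bar{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}$$
Now, rewrite $Z_{n-1}$ as
$$Z_{n-1}=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\cdot\frac{1}{\sqrt{s_{n}^{2}/\sigma^{2}}}=\frac{Y}{\sqrt{W}}$$
where we have defined
$$Y=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}},\qquad W=\frac{s_{n}^{2}}{\sigma^{2}}$$
Given the assumptions made above, the adjusted sample variance $s_{n}^{2}$ has a Gamma distribution with parameters $n-1$ and $\sigma^{2}$, as demonstrated in the lecture on Point estimation of variance. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and 1. Moreover, the random variable $Y$ has a standard normal distribution (see the previous section). Furthermore, $Y$ and $W$ are independent, because the sample mean and the adjusted sample variance of a normal sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of an independent Gamma random variable with parameters $n-1$ and 1. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture on the Student's t distribution for a proof of this fact).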

Level of confidence

The coverage probability does not depend on the unknown parameters $\mu$ and $\sigma^{2}$.

Therefore, the level of confidence coincides with the coverage probability:
$$c(T_{n})=P(-z\leq Z_{n-1}\leq z)$$
where $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom.
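A small simulation can make both claims tangible. In the sketch below, all parameter values are hypothetical. The estimated coverage of the interval is computed for two very different pairs of mean and variance, and it is compared with the probability that a standard normal variable falls in $[-z,z]$. With a small sample size, the two estimates should be close to each other and visibly below the normal probability, which is what the heavier tails of the t distribution imply.

```python
import math
import random
from statistics import NormalDist, fmean


def simulated_t_interval_coverage(mu, sigma2, n, z, n_reps=20000, seed=0):
    """Fraction of simulated samples whose unknown-variance interval contains mu."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    hits = 0
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # adjusted sample variance
        half_width = z * math.sqrt(s2 / n)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_reps


# Hypothetical settings: same n and z, different mean and variance.
z = 2.0
coverage_a = simulated_t_interval_coverage(mu=0.0, sigma2=1.0, n=5, z=z, seed=1)
coverage_b = simulated_t_interval_coverage(mu=50.0, sigma2=9.0, n=5, z=z, seed=2)
normal_probability = NormalDist().cdf(z) - NormalDist().cdf(-z)
print(f"coverage (mu=0, sigma2=1):  {coverage_a:.4f}")
print(f"coverage (mu=50, sigma2=9): {coverage_b:.4f}")
print(f"P(-z <= Z <= z), Z normal:  {normal_probability:.4f}")
```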

How to adjust the level of confidence

As before, the constant $z$ is adjusted so as to achieve the desired level of confidence.

If the latter is equal to $c$, then
$$z=F^{-1}\left(\frac{1+c}{2}\right)$$
where $F$ is the cumulative distribution function of a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

The proof is identical to the one given above for the case of known variance, because the Student's t distribution is also symmetric around 0.
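Python's standard library has no Student's t quantile function, so the sketch below builds one from scratch. It integrates the t density with Simpson's rule to obtain the distribution function and then inverts it by bisection. All function names are hypothetical, and the numerical method is an assumption chosen for self-containment. A statistics library would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, n_steps=2000):
    """Distribution function via composite Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / n_steps  # n_steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df, tol=1e-10):
    """Invert t_cdf by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = -1.0, 1.0
    while t_cdf(lo, df) > p:   # widen the bracket to the left if needed
        lo *= 2
    while t_cdf(hi, df) < p:   # widen the bracket to the right if needed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z_for_confidence_unknown_variance(c, n):
    """Return z = F^{-1}((1 + c) / 2), with F the t distribution function, n - 1 d.o.f."""
    return t_quantile((1 + c) / 2, n - 1)


# Hypothetical example: 95% confidence with a sample of 10 observations.
z = z_for_confidence_unknown_variance(0.95, 10)
print(f"z = {z:.4f}")
```

For a given level of confidence, the constant obtained in this way is larger than its standard normal counterpart, and the gap shrinks as the sample size grows.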

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Suppose that you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^{2}=1$.

Denote the draws by $X_{1},\ldots,X_{100}$.

Their sample mean is[eq23]

Find a confidence interval for $\mu$ having $90\%$ coverage probability.

Solution

For a given sample size $n$, the interval estimator
$$T_{n}=\left[\bar{X}_{n}-z\sqrt{\frac{\sigma^{2}}{n}},\ \bar{X}_{n}+z\sqrt{\frac{\sigma^{2}}{n}}\right]$$
has coverage probability
$$C(T_{n};\mu)=P(-z\leq Z\leq z)$$
where $Z$ is a standard normal random variable and $z$ is a strictly positive constant. Thus, we need to find $z$ such that
$$P(-z\leq Z\leq z)=0.90$$
But
$$P(-z\leq Z\leq z)=P(Z\leq z)-P(Z<-z)=1-P(Z>z)-P(Z<-z)=1-2P(Z>z)$$
where the last equality stems from the fact that the standard normal distribution is symmetric around zero. Therefore $z$ must be such that
$$1-2P(Z>z)=0.90$$
or
$$P(Z>z)=0.05$$
Using normal distribution tables or a computer program to find the value of $z$ (see the lecture entitled Normal distribution - Values), we obtain
$$z\approx 1.645$$
Thus, the confidence interval for $\mu$ is[eq32]
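The numerical part of this solution can be reproduced with a few lines of Python. The sketch below computes the constant $z$ and the half-width of the interval for $n=100$ and $\sigma^{2}=1$. The observed sample mean is not hard-coded, so the helper `interval_around` (a hypothetical name) takes it as an argument.

```python
import math
from statistics import NormalDist

n = 100        # sample size
sigma2 = 1.0   # known variance
c = 0.90       # required coverage probability

# P(Z > z) = 0.05 is equivalent to F(z) = 0.95 = (1 + c) / 2.
z = NormalDist().inv_cdf((1 + c) / 2)
half_width = z * math.sqrt(sigma2 / n)


def interval_around(xbar):
    """Confidence interval for mu given the observed sample mean xbar."""
    return xbar - half_width, xbar + half_width


print(f"z = {z:.4f}, half-width = {half_width:.4f}")
```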

Exercise 2

Suppose you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^{2}$.

Denote the draws by $X_{1},\ldots,X_{100}$.

The sample mean is[eq33]

The adjusted sample variance is[eq34]

Set the level of confidence at 99% and find a confidence interval for the mean mu.

Solution

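For a given sample size $n$, the interval estimator
$$T_{n}=\left[\bar{X}_{n}-z\sqrt{\frac{s_{n}^{2}}{n}},\ \bar{X}_{n}+z\sqrt{\frac{s_{n}^{2}}{n}}\right]$$
has coverage probability
$$C(T_{n};\mu,\sigma^{2})=P(-z\leq Z_{n-1}\leq z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom and $z$ is a strictly positive constant. Thus, we need to find $z$ such that
$$P(-z\leq Z_{99}\leq z)=0.99$$
But
$$P(-z\leq Z_{99}\leq z)=P(Z_{99}\leq z)-P(Z_{99}<-z)=1-P(Z_{99}>z)-P(Z_{99}<-z)=1-2P(Z_{99}>z)$$
where the last equality stems from the fact that the standard Student's t distribution is symmetric around zero. Therefore $z$ must be such that
$$1-2P(Z_{99}>z)=0.99$$
or
$$P(Z_{99}>z)=0.005$$
Using a computer program to find the value of $z$ (for example, with the MATLAB command tinv(0.995,99)), we obtain
$$z\approx 2.6264$$
Thus, the confidence interval for $\mu$ is[eq43]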
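As a cross-check that does not rely on MATLAB, the sketch below integrates the Student's t density with 99 degrees of freedom over $[-z,z]$ using Simpson's rule. The value `z = 2.6264` is an assumption: it is taken here as the 0.995 quantile rounded to four decimals. The function names are hypothetical. If the value is right, the computed probability should be very close to 0.99.

```python
import math


def t_pdf(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def central_t_probability(z, df, n_steps=2000):
    """P(-z <= T <= z): twice the Simpson's-rule integral of the density on [0, z]."""
    h = z / n_steps  # n_steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(z, df)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3


z = 2.6264  # assumed rounded value of the 0.995 quantile with 99 degrees of freedom
coverage = central_t_probability(z, df=99)
print(f"P(-z <= Z_99 <= z) = {coverage:.5f}")
```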

How to cite

Please cite as:

Taboga, Marco (2021). "Confidence interval for the mean", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/set-estimation-mean.
