
Confidence interval for the mean

by Marco Taboga, PhD

This lecture shows how to derive confidence intervals for the mean of a normal distribution.

We tackle two different cases:

when the variance of the distribution is known;
when the variance is unknown.

In each case we derive the level of confidence and we discuss how it is set.

We conclude with two solved exercises.

The theory needed to fully understand the derivations can be found in the lecture on interval estimation.

Table of contents

Known variance
Unknown variance
Solved exercises
1. Exercise 1
2. Exercise 2

Known variance

We start from the simpler case in which the variance is known.

The sample

We observe the realizations of $n$ independent random variables $X_{1}$, ..., $X_{n}$, all having a normal distribution with

unknown mean $\mu$;
known variance $\sigma^{2}$.

The interval

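To construct a confidence interval for the mean $\mu$, we use the sample mean $\bar{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$.

The confidence interval is $T_{n}=\left[\bar{X}_{n}-z\sqrt{\sigma^{2}/n},\ \bar{X}_{n}+z\sqrt{\sigma^{2}/n}\right]$ where $z$ is a strictly positive constant.

We explain below how $z$ is chosen.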

Coverage probability

The coverage probability is the probability that the confidence interval will include the true mean $\mu$.

The coverage probability of $T_{n}$ is $\mathrm{P}(\mu\in T_{n})=\mathrm{P}(-z\leq Z\leq z)$ where $Z$ is a standard normal random variable.

Proof

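The coverage probability can be written as $\mathrm{P}(\mu\in T_{n})=\mathrm{P}\left(\bar{X}_{n}-z\sqrt{\sigma^{2}/n}\leq\mu\leq\bar{X}_{n}+z\sqrt{\sigma^{2}/n}\right)=\mathrm{P}\left(-z\leq\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\leq z\right)=\mathrm{P}(-z\leq Z\leq z)$ where we have defined $Z=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}$. Given the assumptions made above, the sample mean $\bar{X}_{n}$ has a normal distribution with mean $\mu$ and variance $\sigma^{2}/n$, as demonstrated in the lecture on Point estimation of the mean. If we de-mean a normal random variable and divide it by the square root of its variance, we obtain a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.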
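
The result can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original lecture: it draws many samples, builds the interval with a fixed constant, and counts how often the true mean is covered. The function name `simulate_coverage`, the constant `z = 1.5`, the sample size and the two parameter pairs are arbitrary choices made for illustration.

```python
import random
from statistics import NormalDist, fmean


def simulate_coverage(mu, sigma2, n, z, n_trials=20000, seed=0):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    half_width = z * (sigma2 / n) ** 0.5  # z * sqrt(sigma^2 / n)
    hits = 0
    for _ in range(n_trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / n_trials


z = 1.5  # arbitrary illustrative constant
theoretical = 2 * NormalDist().cdf(z) - 1  # P(-z <= Z <= z)

# Two arbitrary (mean, variance) pairs: the estimates should both be
# close to the theoretical value, whatever the parameters are.
estimates = [
    simulate_coverage(mu, sigma2, n=10, z=z)
    for mu, sigma2 in [(0.0, 1.0), (5.0, 4.0)]
]
print(theoretical, estimates)
```

Up to simulation noise, the estimated frequencies should agree with the theoretical probability for both parameter pairs, which illustrates that the coverage probability depends only on the constant and not on the mean or the variance.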

Level of confidence

The coverage probability does not depend on the unknown parameter $\mu$.

Therefore, the level of confidence coincides with the coverage probability: $c=\mathrm{P}(-z\leq Z\leq z)$ where $Z$ is a standard normal random variable.

How to adjust the level of confidence

The level of confidence is chosen by the statistician, who adjusts the constant $z$ accordingly.

If the level of confidence is set equal to $c$, then $z=F^{-1}\left(\frac{1+c}{2}\right)$ where $F$ is the cumulative distribution function of a standard normal random variable.

Proof

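The level of confidence can be written as $c=\mathrm{P}(-z\leq Z\leq z)=\mathrm{P}(Z\leq z)-\mathrm{P}(Z<-z)=\mathrm{P}(Z\leq z)-\mathrm{P}(Z>z)=\mathrm{P}(Z\leq z)-\left[1-\mathrm{P}(Z\leq z)\right]=2F(z)-1$ where we have used the fact that $\mathrm{P}(Z<-z)=\mathrm{P}(Z>z)$ by the symmetry of the standard normal distribution around $0$. Therefore, $z=F^{-1}\left(\frac{1+c}{2}\right)$.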
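
The relation between the level of confidence and the constant can be evaluated with the Python standard library. This is only a sketch: the helper name `z_for_confidence` is hypothetical, and the three levels in the loop are common textbook choices rather than values taken from the lecture.

```python
from statistics import NormalDist


def z_for_confidence(c):
    """Return z such that P(-z <= Z <= z) = c for a standard normal Z."""
    if not 0.0 < c < 1.0:
        raise ValueError("the level of confidence must lie strictly between 0 and 1")
    # Invert c = 2 F(z) - 1, that is, z = F^{-1}((1 + c) / 2).
    return NormalDist().inv_cdf((1.0 + c) / 2.0)


for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}  z = {z_for_confidence(c):.4f}")
# c = 0.90  z = 1.6449
# c = 0.95  z = 1.9600
# c = 0.99  z = 2.5758
```

Raising the level of confidence increases the constant and therefore widens the interval.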

Unknown variance

We now relax the assumption that the variance of the distribution is known.

The sample

We observe the realizations of $n$ independent random variables $X_{1}$, ..., $X_{n}$, all having a normal distribution with

unknown mean $\mu$;
unknown variance $\sigma^{2}$.

The interval

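To construct a confidence interval for the mean $\mu$, we use the sample mean $\bar{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$ and the adjusted sample variance $s_{n}^{2}=\frac{1}{n-1}\sum_{i=1}^{n}\left(X_{i}-\bar{X}_{n}\right)^{2}$.

The confidence interval for the mean is: $T_{n}=\left[\bar{X}_{n}-z\sqrt{s_{n}^{2}/n},\ \bar{X}_{n}+z\sqrt{s_{n}^{2}/n}\right]$ where $z$ is a strictly positive constant.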

Coverage probability

The coverage probability of the confidence interval is $\mathrm{P}(\mu\in T_{n})=\mathrm{P}(-z\leq Z_{n-1}\leq z)$ where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

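The coverage probability can be written as $\mathrm{P}(\mu\in T_{n})=\mathrm{P}\left(\bar{X}_{n}-z\sqrt{s_{n}^{2}/n}\leq\mu\leq\bar{X}_{n}+z\sqrt{s_{n}^{2}/n}\right)=\mathrm{P}\left(-z\leq\frac{\bar{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}\leq z\right)=\mathrm{P}(-z\leq Z_{n-1}\leq z)$ where we have defined $Z_{n-1}=\frac{\bar{X}_{n}-\mu}{\sqrt{s_{n}^{2}/n}}$. Now, rewrite $Z_{n-1}$ as $Z_{n-1}=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}\sqrt{\frac{\sigma^{2}}{s_{n}^{2}}}=\frac{Z}{\sqrt{W}}$ where we have defined $Z=\frac{\bar{X}_{n}-\mu}{\sqrt{\sigma^{2}/n}}$ and $W=\frac{s_{n}^{2}}{\sigma^{2}}$. Given the assumptions made above, the adjusted sample variance $s_{n}^{2}$ has a Gamma distribution with parameters $n-1$ and $\sigma^{2}$, as demonstrated in the lecture on Point estimation of variance. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$ (equivalently, $(n-1)W$ has a Chi-square distribution with $n-1$ degrees of freedom). Moreover, the random variable $Z$ has a standard normal distribution (see the previous section), and it is independent of $W$ because the sample mean and the adjusted sample variance of a normal sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of an independent Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture on the Student's t distribution for a proof of this fact).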
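
A small simulation can make the role of the t distribution concrete. The sketch below is an illustration under assumed values, not part of the original lecture: it uses samples of size 10 and compares the constant 2.262 (the tabulated 0.975 quantile of a Student's t distribution with 9 degrees of freedom) with the constant 1.96 (the corresponding standard normal quantile). The function name `simulate_t_coverage` and the chosen mean and standard deviation are hypothetical.

```python
import random
from statistics import fmean


def simulate_t_coverage(mu, sigma, n, z, n_trials=20000, seed=0):
    """Coverage frequency of the interval built with the adjusted sample variance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        # Adjusted sample variance: divide by n - 1, not n.
        s2 = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)
        half_width = z * (s2 / n) ** 0.5
        if abs(sample_mean - mu) <= half_width:
            hits += 1
    return hits / n_trials


# Same seed for both runs, so both constants are tested on identical samples.
cov_t = simulate_t_coverage(mu=3.0, sigma=2.0, n=10, z=2.262)
cov_normal = simulate_t_coverage(mu=3.0, sigma=2.0, n=10, z=1.96)
print(cov_t, cov_normal)
```

With the t quantile the coverage frequency should be close to 0.95, while the normal quantile gives a narrower interval that covers the mean less often, because it ignores the extra variability introduced by estimating the variance.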

Level of confidence

The coverage probability does not depend on the unknown parameters $\mu$ and $\sigma^{2}$.

Therefore, the level of confidence coincides with the coverage probability: $c=\mathrm{P}(-z\leq Z_{n-1}\leq z)$ where $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom.

How to adjust the level of confidence

As before, the constant $z$ is adjusted so as to achieve the desired level of confidence.

If the latter is equal to $c$, then $z=F_{n-1}^{-1}\left(\frac{1+c}{2}\right)$ where $F_{n-1}$ is the cumulative distribution function of a standard Student's t random variable with $n-1$ degrees of freedom.

Proof

The proof is identical to the one shown above for the case of known variance, because the t distribution is also symmetric around $0$.

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Suppose that you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^{2}=1$.

Denote the draws by $X_{1}$, ..., $X_{100}$.

Their sample mean is [eq23]

Find a confidence interval for $\mu$ having coverage probability.

Solution

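For a given sample size $n$, the interval estimator $T_{n}=\left[\bar{X}_{n}-z\sqrt{\sigma^{2}/n},\ \bar{X}_{n}+z\sqrt{\sigma^{2}/n}\right]$ has coverage probability $\mathrm{P}(\mu\in T_{n})=\mathrm{P}(-z\leq Z\leq z)$ where $Z$ is a standard normal random variable and $z$ is a strictly positive constant. Thus, we need to find $z$ such that $\mathrm{P}(-z\leq Z\leq z)=c$, where $c$ denotes the required coverage probability. But $\mathrm{P}(-z\leq Z\leq z)=\mathrm{P}(Z\leq z)-\mathrm{P}(Z<-z)=1-\mathrm{P}(Z>z)-\mathrm{P}(Z<-z)=1-2\,\mathrm{P}(Z>z)$ where the last equality stems from the fact that the standard normal distribution is symmetric around zero. Therefore $z$ must be such that $1-2\,\mathrm{P}(Z>z)=c$ or $\mathrm{P}(Z>z)=\frac{1-c}{2}$. The value of $z$ satisfying this condition can be found using normal distribution tables or a computer program (see the lecture entitled Normal distribution - Values). Thus, since $\sqrt{\sigma^{2}/n}=1/10$, the confidence interval for $\mu$ is $\left[\bar{X}_{100}-z/10,\ \bar{X}_{100}+z/10\right]$, which evaluates to [eq32]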
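
The steps of the solution can be sketched in Python as follows. The sample size and the variance are those of the exercise, whereas the values passed for `sample_mean` and `c` are illustrative placeholders and not the exercise's data. The function name `known_variance_interval` is hypothetical.

```python
from statistics import NormalDist


def known_variance_interval(sample_mean, sigma2, n, c):
    """Confidence interval for the mean of a normal sample with known variance."""
    z = NormalDist().inv_cdf((1.0 + c) / 2.0)  # solves P(Z > z) = (1 - c) / 2
    half_width = z * (sigma2 / n) ** 0.5       # z * sqrt(sigma^2 / n)
    return sample_mean - half_width, sample_mean + half_width


# n = 100 and sigma2 = 1 come from the exercise;
# sample_mean = 0.0 and c = 0.95 are placeholders chosen for illustration.
lower, upper = known_variance_interval(sample_mean=0.0, sigma2=1.0, n=100, c=0.95)
print(lower, upper)
```

Because the variance is known, the width of the interval is fixed before the data are observed: only its center depends on the sample.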

Exercise 2

Suppose you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^{2}$.

Denote the draws by $X_{1}$, ..., $X_{100}$.

The sample mean is [eq33]

The adjusted sample variance is [eq34]

Set the level of confidence at 99% and find a confidence interval for the mean $\mu$.

Solution

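For a given sample size $n$, the interval estimator $T_{n}=\left[\bar{X}_{n}-z\sqrt{s_{n}^{2}/n},\ \bar{X}_{n}+z\sqrt{s_{n}^{2}/n}\right]$ has coverage probability $\mathrm{P}(\mu\in T_{n})=\mathrm{P}(-z\leq Z_{n-1}\leq z)$ where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom and $z$ is a strictly positive constant. Thus, with $n=100$, we need to find $z$ such that $\mathrm{P}(-z\leq Z_{99}\leq z)=0.99$. But $\mathrm{P}(-z\leq Z_{99}\leq z)=\mathrm{P}(Z_{99}\leq z)-\mathrm{P}(Z_{99}<-z)=1-\mathrm{P}(Z_{99}>z)-\mathrm{P}(Z_{99}<-z)=1-2\,\mathrm{P}(Z_{99}>z)$ where the last equality stems from the fact that the standard Student's t distribution is symmetric around zero. Therefore $z$ must be such that $1-2\,\mathrm{P}(Z_{99}>z)=0.99$ or: $\mathrm{P}(Z_{99}\leq z)=0.995$. Using a computer program to find the value of $z$ (for example, with the MATLAB command tinv(0.995,99)), we obtain $z\approx 2.626$. Thus, since $\sqrt{n}=10$, the confidence interval for $\mu$ is $\left[\bar{X}_{100}-2.626\sqrt{s_{100}^{2}}/10,\ \bar{X}_{100}+2.626\sqrt{s_{100}^{2}}/10\right]$, which evaluates to [eq43]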
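
For readers without MATLAB, the quantile can also be approximated with the Python standard library alone. The sketch below is one possible approach and an assumption of this example, not the method used in the lecture: it integrates the t density with Simpson's rule and inverts the resulting distribution function by bisection. The function names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Distribution function via composite Simpson's rule on [0, |x|] and symmetry."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df, tol=1e-10):
    """Quantile by bisection; restricted to 0.5 <= p < 1 to keep the sketch short."""
    if not 0.5 <= p < 1.0:
        raise ValueError("this sketch only handles 0.5 <= p < 1")
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # expand the bracket until it contains the quantile
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


z = t_quantile(0.995, 99)  # counterpart of tinv(0.995, 99)
print(round(z, 3))
```

The printed value should be approximately 2.626. Statistical libraries offer ready-made and more accurate quantile functions, so this routine is meant only to show what such a function computes.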

How to cite

Please cite as:

Taboga, Marco (2021). "Confidence interval for the mean", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/set-estimation-mean.