This lecture presents some examples of
hypothesis testing, focusing on
**tests of hypothesis about the mean**, i.e., on using a sample
to perform tests of hypothesis about the mean of an unknown distribution.

In this example we make the same assumptions we made in the example of set estimation of the mean entitled Set estimation of the mean - Normal IID samples. The reader is strongly advised to read that example before reading this one.

In this example, the sample $\xi_n$ is made of $n$ independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \ \ldots \ x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \ \ldots \ X_n]$.

We test the null hypothesis that the mean $\mu$ is equal to a specific value $\mu_0$:

$$H_0: \mu = \mu_0$$

We assume that the parameter space is the whole real line, i.e., $\mu \in \mathbb{R}$. Therefore, the alternative hypothesis is

$$H_1: \mu \neq \mu_0$$

To construct a test statistic, we use the sample mean

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The test statistic is

$$Z_n = \frac{\bar{X}_n - \mu_0}{\sqrt{\sigma^2/n}}$$

This test statistic is often called **z-statistic** or **normal
z-statistic** and a test of hypothesis based on this statistic is
called **z-test** or **normal z-test**.

Let $z > 0$. We reject the null hypothesis if $Z_n > z$ or if $Z_n < -z$. In other words, the critical region is

$$C_{Z_n} = (-\infty, -z) \cup (z, \infty)$$

Thus, the critical values of the test are $-z$ and $z$.
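The decision rule can be sketched in a few lines of Python. This is only an illustration: it assumes the z-statistic is the sample mean minus the hypothesized mean, divided by the square root of the known variance over the sample size, and the function names, the data and the critical value 1.96 are hypothetical choices made for the example.

```python
from math import sqrt
from statistics import fmean


def z_statistic(sample, mu0, sigma2):
    """z-statistic: (sample mean - mu0) / sqrt(sigma2 / n), with sigma2 known."""
    n = len(sample)
    return (fmean(sample) - mu0) / sqrt(sigma2 / n)


def z_test_rejects(sample, mu0, sigma2, z):
    """True when the z-statistic falls below -z or above z (the critical region)."""
    return abs(z_statistic(sample, mu0, sigma2)) > z


# Hypothetical data: four draws, known variance 1, null mean 0.
sample = [0.5, 1.5, 1.0, 2.0]
stat = z_statistic(sample, mu0=0.0, sigma2=1.0)
reject = z_test_rejects(sample, mu0=0.0, sigma2=1.0, z=1.96)
print(stat, reject)
```

With these numbers the sample mean is 1.25 and the standard deviation of the sample mean is 0.5, so the statistic is 2.5 and the null hypothesis is rejected.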

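The power function of the test is

$$\pi(\mu) = P_\mu\left(Z_n \notin [-z, z]\right) = 1 - P\left(-z + \frac{\mu_0 - \mu}{\sqrt{\sigma^2/n}} \le Z \le z + \frac{\mu_0 - \mu}{\sqrt{\sigma^2/n}}\right)$$

where $Z$ is a standard normal random variable and the notation $P_\mu$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true mean is equal to $\mu$.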

Proof

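The power function can be written as

$$
\begin{aligned}
\pi(\mu) &= P_\mu\left(Z_n \notin [-z, z]\right) \\
&= 1 - P_\mu\left(-z \le Z_n \le z\right) \\
&= 1 - P_\mu\left(-z \le \frac{\bar{X}_n - \mu_0}{\sqrt{\sigma^2/n}} \le z\right) \\
&= 1 - P_\mu\left(-z \le \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} + \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}} \le z\right) \\
&= 1 - P\left(-z + \frac{\mu_0 - \mu}{\sqrt{\sigma^2/n}} \le Z \le z + \frac{\mu_0 - \mu}{\sqrt{\sigma^2/n}}\right)
\end{aligned}
$$

where we have defined

$$Z = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}}$$

As demonstrated in the lecture entitled Point estimation of the mean, the sample mean $\bar{X}_n$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$, given the assumptions on the sample we made above. Subtracting the mean of a normal random variable from the random variable itself and dividing it by the square root of its variance, one obtains a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.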
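As a numerical companion to the proof, the following sketch evaluates the power function of the z-test with the standard normal distribution function from Python's standard library. The function name `z_test_power` and the parameter values in the usage lines are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu, mu0, sigma2, n, z):
    """Probability of rejecting H0: mean = mu0 when the true mean is mu.

    Computes 1 - P(-z + shift <= Z <= z + shift) for a standard normal Z,
    where shift = (mu0 - mu) / sqrt(sigma2 / n).
    """
    shift = (mu0 - mu) / sqrt(sigma2 / n)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    prob_not_rejecting = std_normal.cdf(z + shift) - std_normal.cdf(-z + shift)
    return 1.0 - prob_not_rejecting


# Hypothetical setting: null mean 0, variance 1, 25 observations, z = 1.96.
for true_mean in (0.0, 0.2, 0.5):
    print(true_mean, z_test_power(true_mean, mu0=0.0, sigma2=1.0, n=25, z=1.96))
```

The printed values increase as the true mean moves away from the hypothesized one, which is the behavior one expects from a two-sided test.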

When evaluated at the point $\mu = \mu_0$, the power function is equal to the probability of committing a Type I error, i.e., the probability of rejecting the null hypothesis when the null hypothesis is true. This probability is called the size of the test and it is equal to

$$\pi(\mu_0) = 1 - P\left(-z \le Z \le z\right)$$

where $Z$ is a standard normal random variable (this is trivially obtained by substituting $\mu$ with $\mu_0$ in the formula for the power function found above).
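In practice one often goes in the opposite direction: a desired size is fixed first and the critical value is derived from it. Since the standard normal density is symmetric around zero, a size equal to $\alpha$ corresponds to $z = \Phi^{-1}(1 - \alpha/2)$, where $\Phi$ denotes the standard normal distribution function. The symbol $\alpha$ and the function names below are not part of the lecture; they are introduced only for this sketch.

```python
from statistics import NormalDist


def z_test_size(z):
    """Size of the z-test: 1 - P(-z <= Z <= z) for a standard normal Z."""
    std_normal = NormalDist()
    return 1.0 - (std_normal.cdf(z) - std_normal.cdf(-z))


def z_critical_value(alpha):
    """Critical value z > 0 giving a two-sided z-test of size alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Round trip: pick a size, derive the critical value, recompute the size.
z = z_critical_value(0.05)
print(z, z_test_size(z))
```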

This example is similar to the previous one. The only difference is that we now relax the assumption that the variance of the distribution is known.

In this example, the sample $\xi_n$ is made of $n$ independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^2$. Specifically, we observe $n$ realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \ \ldots \ x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \ \ldots \ X_n]$.

We test the null hypothesis that the mean $\mu$ is equal to a specific value $\mu_0$:

$$H_0: \mu = \mu_0$$

We assume that the parameter space is the whole real line, i.e., $\mu \in \mathbb{R}$. Therefore, the alternative hypothesis is

$$H_1: \mu \neq \mu_0$$

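We construct two test statistics, by using the sample mean

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

and either the unadjusted sample variance

$$S_n^2 = \frac{1}{n}\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2$$

or the adjusted sample variance

$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}_n\right)^2$$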

The two test statistics are

$$Z_n^u = \frac{\bar{X}_n - \mu_0}{\sqrt{S_n^2/n}}, \qquad Z_n^a = \frac{\bar{X}_n - \mu_0}{\sqrt{s_n^2/n}}$$

where the superscripts $u$ and $a$ indicate whether the test statistic is based on the unadjusted or the adjusted
sample variance. These two test statistics are often called
**t-statistics** or **Student's t-statistics** and
tests of hypothesis based on these statistics are called
**t-tests** or **Student's t-tests**.
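The two t-statistics can be computed with the standard library alone, as in the sketch below. It assumes that the unadjusted sample variance divides the sum of squared deviations by the sample size and the adjusted one by the sample size minus one, which is what `statistics.pvariance` and `statistics.variance` do respectively. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance, variance


def t_statistics(sample, mu0):
    """Return the t-statistics based on the unadjusted and adjusted variance."""
    n = len(sample)
    sample_mean = fmean(sample)
    unadjusted_var = pvariance(sample)  # sum of squared deviations / n
    adjusted_var = variance(sample)     # sum of squared deviations / (n - 1)
    t_unadjusted = (sample_mean - mu0) / sqrt(unadjusted_var / n)
    t_adjusted = (sample_mean - mu0) / sqrt(adjusted_var / n)
    return t_unadjusted, t_adjusted


# Hypothetical data and null value.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t_u, t_a = t_statistics(sample, mu0=2.0)
print(t_u, t_a)
```

Because the two variances differ only by the factor $(n-1)/n$, the statistic based on the unadjusted variance is always $\sqrt{n/(n-1)}$ times the one based on the adjusted variance, so it is larger in absolute value.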

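Let $z > 0$. We reject the null hypothesis if $Z_n^i > z$ or if $Z_n^i < -z$ (for $i = u$ or $i = a$). In other words, the critical region is

$$C_{Z_n^i} = (-\infty, -z) \cup (z, \infty)$$

Thus, the critical values of the test are $-z$ and $z$.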

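The power function of the test based on the unadjusted sample variance is

$$\pi^u(\mu) = P_\mu\left(Z_n^u \notin [-z, z]\right) = 1 - P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right)$$

where the notation $P_\mu$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true mean is equal to $\mu$ and $W_{n-1}$ is a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter equal to

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}}$$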

Proof

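The power function can be written as

$$
\begin{aligned}
\pi^u(\mu) &= P_\mu\left(Z_n^u \notin [-z, z]\right) \\
&= 1 - P_\mu\left(-z \le Z_n^u \le z\right) \\
&= 1 - P_\mu\left(-z \le \frac{\bar{X}_n - \mu_0}{\sqrt{S_n^2/n}} \le z\right) \\
&= 1 - P_\mu\left(-\sqrt{\frac{n-1}{n}}\, z \le \sqrt{\frac{n-1}{n}}\,\frac{\bar{X}_n - \mu_0}{\sqrt{S_n^2/n}} \le \sqrt{\frac{n-1}{n}}\, z\right) \\
&= 1 - P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right)
\end{aligned}
$$

where we have defined

$$W_{n-1} = \sqrt{\frac{n-1}{n}}\,\frac{\bar{X}_n - \mu_0}{\sqrt{S_n^2/n}} = \frac{\dfrac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} + \dfrac{\mu - \mu_0}{\sqrt{\sigma^2/n}}}{\sqrt{\dfrac{n}{n-1}\,\dfrac{S_n^2}{\sigma^2}}}$$

Given the assumptions on the sample we made above, the sample mean $\bar{X}_n$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$ (see Point estimation of the mean), so that the random variable

$$\frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}}$$

has a standard normal distribution. Furthermore, the unadjusted sample variance $S_n^2$ has a Gamma distribution with parameters $n-1$ and $\frac{n-1}{n}\sigma^2$ (see Point estimation of the variance), so that the random variable

$$\frac{n}{n-1}\,\frac{S_n^2}{\sigma^2}$$

has a Gamma distribution with parameters $n-1$ and $1$ (that is, it is a Chi-square random variable with $n-1$ degrees of freedom divided by $n-1$). In a normal sample, the sample mean and the sample variance are independent. Adding a constant $c$ to a standard normal distribution and dividing the sum thus obtained by the square root of an independent Gamma random variable with parameters $n-1$ and $1$, one obtains a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter $c$. Therefore, the random variable $W_{n-1}$ has a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}}$$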

The power function of the test based on the adjusted sample variance is

$$\pi^a(\mu) = P_\mu\left(Z_n^a \notin [-z, z]\right) = 1 - P\left(-z \le W_{n-1} \le z\right)$$

where the notation $P_\mu$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true mean is equal to $\mu$ and $W_{n-1}$ is a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter equal to

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}}$$

Proof

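The power function can be written as

$$
\begin{aligned}
\pi^a(\mu) &= P_\mu\left(Z_n^a \notin [-z, z]\right) \\
&= 1 - P_\mu\left(-z \le Z_n^a \le z\right) \\
&= 1 - P_\mu\left(-z \le \frac{\bar{X}_n - \mu_0}{\sqrt{s_n^2/n}} \le z\right) \\
&= 1 - P\left(-z \le W_{n-1} \le z\right)
\end{aligned}
$$

where we have defined

$$W_{n-1} = \frac{\bar{X}_n - \mu_0}{\sqrt{s_n^2/n}} = \frac{\dfrac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} + \dfrac{\mu - \mu_0}{\sqrt{\sigma^2/n}}}{\sqrt{\dfrac{s_n^2}{\sigma^2}}}$$

Given the assumptions on the sample we made above, the sample mean $\bar{X}_n$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$ (see Point estimation of the mean), so that the random variable

$$\frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}}$$

has a standard normal distribution. Furthermore, the adjusted sample variance $s_n^2$ has a Gamma distribution with parameters $n-1$ and $\sigma^2$ (see Point estimation of the variance), so that the random variable

$$\frac{s_n^2}{\sigma^2}$$

has a Gamma distribution with parameters $n-1$ and $1$ (that is, it is a Chi-square random variable with $n-1$ degrees of freedom divided by $n-1$). In a normal sample, the sample mean and the sample variance are independent. Adding a constant $c$ to a standard normal distribution and dividing the sum thus obtained by the square root of an independent Gamma random variable with parameters $n-1$ and $1$, one obtains a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter $c$. Therefore, the random variable $W_{n-1}$ has a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}}$$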

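Note that, for a fixed $z$, the test based on the unadjusted sample variance is more powerful than the test based on the adjusted sample variance, i.e.,

$$\pi^u(\mu) > \pi^a(\mu)$$

because

$$\sqrt{\frac{n-1}{n}}\, z < z$$

and, as a consequence

$$P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right) < P\left(-z \le W_{n-1} \le z\right)$$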
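The comparison between the two tests can also be checked by simulation. The sketch below is a rough Monte Carlo estimate, not an exact computation: it draws many normal samples under a hypothetical true mean, applies both t-tests with the same critical value, and counts the rejections. The function name, the seed and all parameter values are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pvariance, variance


def estimate_powers(mu, sigma, mu0, n, z, n_sims=2000, seed=0):
    """Monte Carlo estimates of the power of the two t-tests (unadjusted, adjusted)."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    rejections_unadjusted = 0
    rejections_adjusted = 0
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        t_unadjusted = (sample_mean - mu0) / sqrt(pvariance(sample) / n)
        t_adjusted = (sample_mean - mu0) / sqrt(variance(sample) / n)
        rejections_unadjusted += abs(t_unadjusted) > z
        rejections_adjusted += abs(t_adjusted) > z
    return rejections_unadjusted / n_sims, rejections_adjusted / n_sims


# Hypothetical setting: true mean 0.5, standard deviation 1, null mean 0.
power_u, power_a = estimate_powers(mu=0.5, sigma=1.0, mu0=0.0, n=10, z=2.0)
print(power_u, power_a)
```

Every sample rejected by the test based on the adjusted variance is also rejected by the test based on the unadjusted variance, because the latter statistic is larger in absolute value. The first estimate can therefore never fall below the second.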

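The size of the test based on the unadjusted sample variance is equal to

$$\pi^u(\mu_0) = 1 - P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right)$$

where $W_{n-1}$ is a standard Student's t distribution with $n-1$ degrees of freedom.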

Proof

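When evaluated at the point $\mu = \mu_0$, the power function is equal to the size of the test, i.e., the probability of committing a Type I error. The power function evaluated at $\mu_0$ is

$$\pi^u(\mu_0) = 1 - P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right)$$

where $W_{n-1}$ is a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter equal to

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}}$$

Therefore, when $\mu = \mu_0$, the non-centrality parameter is equal to $0$ and $W_{n-1}$ is just a standard Student's t distribution.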

The size of the test based on the adjusted sample variance is equal to

$$\pi^a(\mu_0) = 1 - P\left(-z \le W_{n-1} \le z\right)$$

where $W_{n-1}$ is a standard Student's t distribution with $n-1$ degrees of freedom.

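Proof

The argument is the same as for the test based on the unadjusted sample variance: when the power function of the test is evaluated at $\mu = \mu_0$, the non-centrality parameter is equal to $0$, so that the distribution appearing in the power function is just a standard Student's t distribution.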

Note that, for a fixed $z$, the test based on the unadjusted sample variance has a greater size than the test based on the adjusted sample variance, because, as demonstrated above, the former also has a greater power than the latter for any value of the true parameter $\mu$.
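The two sizes can be evaluated numerically. Python's standard library has no Student's t distribution function, so the sketch below integrates the standard Student's t density with Simpson's rule. This is a simple approximation chosen for illustration, not the method used in the lecture, and the function names and the values of the critical value and of the sample size are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def t_density(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_central_prob(a, df, steps=2000):
    """Approximate P(-a <= W <= a) by Simpson's rule on [0, a] (steps must be even)."""
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    # The density is symmetric around zero, hence the factor 2.
    return 2.0 * total * h / 3.0


def t_test_sizes(z, n):
    """Sizes of the t-tests based on the unadjusted and adjusted sample variance."""
    df = n - 1
    size_unadjusted = 1.0 - t_central_prob(sqrt((n - 1) / n) * z, df)
    size_adjusted = 1.0 - t_central_prob(z, df)
    return size_unadjusted, size_adjusted


# Hypothetical setting: critical value 2 and 10 observations.
size_u, size_a = t_test_sizes(z=2.0, n=10)
print(size_u, size_a)
```

For the same critical value, the first size is larger than the second, in line with the remark above.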

Below you can find some exercises with explained solutions.

Denote by $F(x; k, c)$ the distribution function of a non-central standard Student's t distribution with $k$ degrees of freedom and non-centrality parameter equal to $c$. Suppose a statistician observes $n$ independent realizations of a normal random variable. The mean and the variance of the random variable, which the statistician does not know, are equal to $\mu$ and $\sigma^2$ respectively. What is the probability, expressed in terms of $F$, that the statistician will reject the null hypothesis that the mean is equal to zero if she runs a t-test based on the $n$ observed realizations, she sets $z$ as the critical value, and she uses the adjusted sample variance to compute the t-statistic?

Solution

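The probability of rejecting the null hypothesis is obtained by evaluating the power function of the test at the true mean $\mu$:

$$\pi^a(\mu) = P_\mu\left(Z_n^a \notin [-z, z]\right) = 1 - P\left(-z \le W_{n-1} \le z\right)$$

where the notation $P_\mu$ is used to indicate the fact that the probability of rejecting the null hypothesis is computed under the hypothesis that the true mean is equal to $\mu$, and $W_{n-1}$ is a non-central standard Student's t distribution with $n-1$ degrees of freedom and non-centrality parameter

$$c = \frac{\mu - \mu_0}{\sqrt{\sigma^2/n}} = \frac{\mu}{\sqrt{\sigma^2/n}}$$

because the hypothesized mean is $\mu_0 = 0$. Thus, the probability of rejecting the null hypothesis is equal to

$$1 - P\left(-z \le W_{n-1} \le z\right) = 1 - F(z; n-1, c) + F(-z; n-1, c)$$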

Denote by $F(x; k)$ the distribution function of a standard Student's t distribution with $k$ degrees of freedom, and by $F^{-1}(p; k)$ its inverse. Suppose that a statistician observes $n$ independent realizations of a normal random variable, and she performs a t-test of the null hypothesis that the mean of the variable is equal to zero, based on the $n$ observed realizations, and using the unadjusted sample variance to compute the t-statistic. What critical value should she use in order to commit a Type I error with 10% probability? Express it in terms of $F^{-1}$.

Solution

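A Type I error is committed when the null hypothesis is true, but it is rejected. The probability of rejecting the null hypothesis is

$$\pi^u(\mu_0) = 1 - P\left(-\sqrt{\frac{n-1}{n}}\, z \le W_{n-1} \le \sqrt{\frac{n-1}{n}}\, z\right)$$

where $z$ is the critical value, and $W_{n-1}$ is a standard Student's t distribution with $n-1$ degrees of freedom. This probability can be expressed as

$$
\begin{aligned}
\pi^u(\mu_0) &= 1 - F\left(\sqrt{\frac{n-1}{n}}\, z;\, n-1\right) + F\left(-\sqrt{\frac{n-1}{n}}\, z;\, n-1\right) \\
&\overset{(A)}{=} 1 - F\left(\sqrt{\frac{n-1}{n}}\, z;\, n-1\right) + 1 - F\left(\sqrt{\frac{n-1}{n}}\, z;\, n-1\right) \\
&= 2 - 2F\left(\sqrt{\frac{n-1}{n}}\, z;\, n-1\right)
\end{aligned}
$$

where: in step $(A)$ we have used the fact that the density of a standard Student's t distribution is symmetric around zero. Thus, we need to set $z$ in such a way that

$$2 - 2F\left(\sqrt{\frac{n-1}{n}}\, z;\, n-1\right) = 0.1$$

This is accomplished by

$$z = \sqrt{\frac{n}{n-1}}\, F^{-1}\left(0.95;\, n-1\right)$$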
