This lecture presents some examples of point
estimation problems, focusing on **mean estimation**, that
is, on using a sample to produce a point estimate of the mean of an unknown
distribution.

In this example, which is probably the most important in the history of statistics, the sample is made of independent draws from a normal distribution having unknown mean $\mu$ and variance $\sigma^2$. Specifically, we observe the realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with mean $\mu$ and variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \; \ldots \; x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \; \ldots \; X_n]$.

As an estimator of the mean $\mu$, we use the sample mean

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i.$$

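The expected value of the estimator $\bar{X}_n$ is equal to the true mean $\mu$. This can be proved using the linearity of the expected value:

$$\operatorname{E}[\bar{X}_n] = \operatorname{E}\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} \operatorname{E}[X_i] = \frac{1}{n} \, n \mu = \mu.$$

Therefore, the estimator $\bar{X}_n$ is unbiased.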

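The variance of the estimator $\bar{X}_n$ is equal to $\sigma^2 / n$. This can be proved using the formula for the variance of a sum of independent random variables:

$$\operatorname{Var}[\bar{X}_n] = \operatorname{Var}\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}[X_i] = \frac{1}{n^2} \, n \sigma^2 = \frac{\sigma^2}{n}.$$

Therefore, the variance of the estimator tends to zero as the sample size $n$ tends to infinity.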

The estimator $\bar{X}_n$ has a normal distribution:

$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

Proof

Note that the sample mean $\bar{X}_n$ is a linear combination of the normal and independent random variables $X_1$, ..., $X_n$ (all the coefficients of the linear combination are equal to $1/n$). Therefore, $\bar{X}_n$ is normal because a linear combination of independent normal random variables is normal. The mean and the variance of the distribution have already been derived above.
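The unbiasedness and the variance formula derived above can be checked numerically. The following is a minimal Monte Carlo sketch that uses only the Python standard library; the parameter values, the number of replications and the seed are arbitrary assumptions chosen for illustration, and the helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Draw `replications` normal samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(replications)
    ]


# Illustrative values (assumptions, not taken from the lecture).
mu, sigma, n = 2.0, 3.0, 25
means = simulate_sample_means(mu, sigma, n, replications=20_000)

empirical_bias = statistics.fmean(means) - mu   # should be close to 0
empirical_var = statistics.pvariance(means)     # should be close to sigma^2 / n
theoretical_var = sigma**2 / n

print(f"empirical bias:     {empirical_bias:.4f}")
print(f"empirical variance: {empirical_var:.4f} (theory: {theoretical_var:.4f})")
```

With these settings the theoretical variance is $9/25 = 0.36$, and the empirical variance of the simulated sample means should land close to it.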

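The mean squared error of the estimator is

$$\operatorname{MSE}(\bar{X}_n) = \operatorname{E}\left[(\bar{X}_n - \mu)^2\right] = \operatorname{Var}[\bar{X}_n] = \frac{\sigma^2}{n},$$

where the second equality holds because the estimator is unbiased.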

The sequence $\{X_n\}$ satisfies the conditions of Kolmogorov's Strong Law of Large Numbers ($\{X_n\}$ is an IID sequence with finite mean). Therefore, the sample mean $\bar{X}_n$ converges almost surely to the true mean $\mu$:

$$\bar{X}_n \xrightarrow{a.s.} \mu,$$

that is, the estimator $\bar{X}_n$ is strongly consistent. Of course, the estimator is also weakly consistent because almost sure convergence implies convergence in probability:

$$\operatorname*{plim}_{n \to \infty} \bar{X}_n = \mu.$$
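Consistency can be illustrated by following the sample mean along one simulated sequence of draws. This is only a sketch under assumed values (mean 5, standard deviation 2, a fixed seed); it shows the estimation error for a few sample sizes taken from the same sequence.

```python
import random
import statistics

rng = random.Random(1)      # fixed seed, arbitrary
mu, sigma = 5.0, 2.0        # illustrative values (assumptions)
draws = [rng.gauss(mu, sigma) for _ in range(100_000)]

# Absolute estimation error of the sample mean for growing sample sizes,
# computed along one simulated sequence of draws.
errors = {n: abs(statistics.fmean(draws[:n]) - mu) for n in (10, 1_000, 100_000)}

for n, err in errors.items():
    print(f"n = {n:>7}: |sample mean - mu| = {err:.4f}")
```

A single simulated path can illustrate almost sure convergence but cannot prove it, and the error need not decrease monotonically from one sample size to the next.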

In this example, the sample is made of independent draws from a probability distribution having unknown mean $\mu$ and variance $\sigma^2$. Specifically, we observe the realizations $x_1$, ..., $x_n$ of $n$ independent random variables $X_1$, ..., $X_n$, all having the same distribution with mean $\mu$ and variance $\sigma^2$. The sample is the $n$-dimensional vector $\xi_n = [x_1 \; \ldots \; x_n]$, which is a realization of the random vector $\Xi_n = [X_1 \; \ldots \; X_n]$. The difference from the previous example is that we no longer assume that the sample points come from a normal distribution.

Again, the estimator of the mean $\mu$ is the sample mean:

$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i.$$

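The expected value of the estimator is equal to the true mean $\mu$, so the estimator is unbiased:

$$\operatorname{E}[\bar{X}_n] = \mu.$$

The proof is the same as in the previous example, since it relies only on the linearity of the expected value.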

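The variance of the estimator is

$$\operatorname{Var}[\bar{X}_n] = \frac{\sigma^2}{n}.$$

Also in this case the proof is the same as in the previous example, since it uses only the independence of the terms of the sum and the fact that each of them has variance $\sigma^2$.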

Unlike in the previous example, the estimator $\bar{X}_n$ does not necessarily have a normal distribution (its distribution depends on the distribution of the terms of the sequence $\{X_n\}$). However, we will see below that $\bar{X}_n$ has a normal distribution asymptotically (i.e., once suitably centered and scaled, it converges in distribution to a normal random variable as $n$ becomes large).

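The mean squared error of the estimator is

$$\operatorname{MSE}(\bar{X}_n) = \operatorname{E}\left[(\bar{X}_n - \mu)^2\right] = \operatorname{Var}[\bar{X}_n] = \frac{\sigma^2}{n}.$$

The proof is the same as in the previous example.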

The sequence $\{X_n\}$ satisfies the conditions of Kolmogorov's Strong Law of Large Numbers ($\{X_n\}$ is an IID sequence with finite mean). Therefore, the estimator $\bar{X}_n$ is both strongly consistent and weakly consistent (see the example above).

The sequence $\{X_n\}$ satisfies the conditions of the Lindeberg-Lévy Central Limit Theorem ($\{X_n\}$ is an IID sequence with finite mean and variance). Therefore, the sample mean is asymptotically normal:

$$\sqrt{n} \, \frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} Z,$$

where $Z$ is a standard normal random variable and $\xrightarrow{d}$ denotes convergence in distribution. In other words, when $n$ is large, the distribution of the sample mean $\bar{X}_n$ is approximately normal with mean $\mu$ and variance $\sigma^2 / n$.
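The asymptotic normality can be illustrated with draws from a clearly non-normal distribution. The sketch below assumes exponential draws with rate 0.5 (so that the mean and the standard deviation both equal 2), a sample size of 50 and a fixed seed; all of these are illustrative choices. It standardizes each simulated sample mean and checks how often the result falls in the interval $[-1.96, 1.96]$, which has probability of about 0.95 under the standard normal distribution.

```python
import math
import random
import statistics

rng = random.Random(2)            # fixed seed, arbitrary
rate = 0.5                        # exponential rate (assumption)
mu, sigma = 1 / rate, 1 / rate    # mean and standard deviation of Exp(rate)
n, replications = 50, 20_000

z_values = []
for _ in range(replications):
    sample_mean = statistics.fmean([rng.expovariate(rate) for _ in range(n)])
    # Standardized sample mean: sqrt(n) * (sample mean - mu) / sigma
    z_values.append(math.sqrt(n) * (sample_mean - mu) / sigma)

coverage = sum(abs(z) <= 1.96 for z in z_values) / replications
print(f"share of |Z_n| <= 1.96: {coverage:.3f} (standard normal: about 0.950)")
```

The agreement is only approximate for any finite $n$; with heavily skewed distributions a larger sample size may be needed before the normal approximation becomes accurate.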

Below you can find some exercises with explained solutions.

Consider an experiment that can have only two outcomes: either success, with probability $p$, or failure, with probability $1 - p$. The probability of success $p$ is unknown, but we know that [...]. Suppose we can independently repeat the experiment as many times as we wish and use the ratio of the number of successes to the total number of experiments as an estimator of $p$. What is the minimum number of experiments needed in order to be sure that the standard deviation of the estimator is less than [...]?

Solution

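Denote by $\widehat{p}$ the estimator of $p$. It can be written as

$$\widehat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i,$$

where $n$ is the number of repetitions of the experiment and $X_1$, ..., $X_n$ are independent random variables having a Bernoulli distribution with parameter $p$. Therefore, $\widehat{p}$ is the sample mean of $n$ independent Bernoulli random variables with expected value $\operatorname{E}[X_i] = p$ and variance $\operatorname{Var}[X_i] = p(1 - p)$. Thus

$$\operatorname{Var}[\widehat{p}] = \frac{p(1 - p)}{n}.$$

We need to ensure that the standard deviation $\sqrt{p(1 - p)/n}$ is less than the required threshold or, equivalently, that $n$ is greater than $p(1 - p)$ divided by the square of the threshold. Since $p$ is unknown, this is certainly verified if $n$ is greater than the largest value that $p(1 - p)$ can take over the admissible values of $p$, divided by the square of the threshold.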
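The reasoning in this solution can be sketched in code. The function names, the interval of admissible values of $p$ and the thresholds used below are hypothetical and chosen only for illustration; they are not the values of the exercise. Exact fractions are used so that the strict inequality is handled without rounding problems.

```python
import math
from fractions import Fraction


def max_bernoulli_variance(p_low, p_high):
    """Largest value of p * (1 - p) for p in the interval [p_low, p_high]."""
    half = Fraction(1, 2)
    # p * (1 - p) peaks at 1/2, so take the point of the interval closest to 1/2.
    p_star = min(max(half, p_low), p_high)
    return p_star * (1 - p_star)


def min_experiments(p_low, p_high, threshold):
    """Smallest n such that sqrt(p * (1 - p) / n) < threshold for every admissible p."""
    bound = max_bernoulli_variance(p_low, p_high) / threshold**2
    # The inequality n > bound is strict, hence floor(bound) + 1.
    return math.floor(bound) + 1


# Hypothetical inputs (assumptions, not the values of the exercise).
print(min_experiments(Fraction(1, 4), Fraction(3, 4), Fraction(1, 20)))   # 101
print(min_experiments(Fraction(3, 5), Fraction(9, 10), Fraction(1, 50)))  # 601
```

In the first call the interval contains $1/2$, so the worst-case variance is $1/4$ and the bound is $(1/4) / (1/400) = 100$; because the inequality is strict, 101 experiments are needed.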

Suppose you observe a sample of independent draws from a distribution having unknown mean $\mu$ and known variance $\sigma^2$. How can you approximate the distribution of their sample mean?

Solution

We can approximate the distribution of the sample mean with its asymptotic distribution. So the distribution of the sample mean can be approximated by a normal distribution with mean $\mu$ and variance $\sigma^2 / n$, where $n$ is the sample size.
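As a sketch of how this approximation can be used, the snippet below relies on `statistics.NormalDist` from the Python standard library. Because $\mu$ is unknown, it works with the estimation error $\bar{X}_n - \mu$, which is approximately normal with mean 0 and variance $\sigma^2 / n$. The variance and the sample size used here are hypothetical values, and the function name `approx_error_dist` is an assumption of this example.

```python
import math
from statistics import NormalDist


def approx_error_dist(sigma2, n):
    """Normal approximation to the distribution of (sample mean - mu)."""
    return NormalDist(mu=0.0, sigma=math.sqrt(sigma2 / n))


# Hypothetical inputs (assumptions): known variance 4, sample size 100.
approx = approx_error_dist(sigma2=4.0, n=100)   # standard deviation 0.2

# Approximate probability that the sample mean is within 0.4 of the true mean.
prob = approx.cdf(0.4) - approx.cdf(-0.4)
print(f"P(|sample mean - mu| < 0.4) is approximately {prob:.4f}")
```

Here 0.4 equals two standard deviations of the sample mean, so the printed probability is the familiar two-sigma probability of the normal distribution, about 0.9545.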
