The plug-in principle is a technique used in probability theory and statistics to approximately compute or to estimate a feature of a probability distribution (e.g., the expected value, the variance, a quantile) that cannot be computed exactly. It is widely used in the theories of Monte Carlo simulation and bootstrapping.
Roughly speaking, the plug-in principle says that a feature of a given distribution can be approximated by the same feature of the empirical distribution of a sample of observations drawn from the given distribution. The feature of the empirical distribution is called a plug-in estimate of the feature of the given distribution. For example, a quantile of a given distribution can be approximated by the analogous quantile of the empirical distribution of a sample of draws from the given distribution.
The following is a formal definition of plug-in estimate.
Definition
Let $\mathcal{F}$ be a set of distribution functions. Let $T$ be a mapping $T:\mathcal{F}\rightarrow\mathbb{R}$. Let $F\in\mathcal{F}$. Let $x_1,\ldots,x_n$ be a sample of realizations of $n$ random variables $X_1,\ldots,X_n$, all having distribution function $F$. Let $F_n$ be the empirical distribution function of the sample. If $F_n\in\mathcal{F}$, then the quantity $T(F_n)$ is called a plug-in estimate of $T(F)$.
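To make the definition concrete, here is a minimal Python sketch (the representation of $T$ as a function acting on a cumulative distribution function, the standard normal sample, and the functional $T(G)=G(0)$ are illustrative choices, not part of the definition above).

```python
import numpy as np

def empirical_cdf(sample):
    """Return the empirical distribution function F_n of the sample."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    # F_n(t) = (number of observations <= t) / n
    return lambda t: np.searchsorted(x, t, side="right") / n

def plug_in_estimate(T, sample):
    """Plug-in estimate T(F_n): apply the mapping T to the empirical CDF."""
    return T(empirical_cdf(sample))

# Illustrative choice of T: T(G) = G(0), the probability that X <= 0.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)
T = lambda G: G(0.0)
print(plug_in_estimate(T, sample))  # close to the true value T(F) = 0.5
```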
The next section will provide an informal discussion of the conditions under which $T(F_n)$ converges to $T(F)$ as the sample size $n$ increases. Before doing that, we will provide some examples to clarify the meaning of the mapping $T$.
The next example shows how the plug-in principle can be used to approximate
expected values.
Example
Suppose we need to compute the expected value $\mathbb{E}\left[g(X)\right]$, where $X$ is a random variable and $g$ is a function. If $F$ is the distribution function of $X$, then the expected value can be written as a Riemann-Stieltjes integral (see the lecture entitled Expected value) as follows:
$$\mathbb{E}\left[g(X)\right]=\int_{-\infty}^{+\infty}g(x)\,dF(x).$$
We can define a mapping $T$ such that, for any distribution function $G\in\mathcal{F}$, we have
$$T(G)=\int_{-\infty}^{+\infty}g(x)\,dG(x).$$
Thus, the expected value we need to compute is
$$\mathbb{E}\left[g(X)\right]=T(F).$$
Now, if we have a sample of $n$ draws $x_1,\ldots,x_n$ from the distribution $F$, their empirical distribution function $F_n$ is the distribution function of a discrete random variable that can take any of the values $x_1,\ldots,x_n$ with probability $1/n$. As a consequence, the plug-in estimate of $\mathbb{E}\left[g(X)\right]$ is
$$T(F_n)=\int_{-\infty}^{+\infty}g(x)\,dF_n(x)=\frac{1}{n}\sum_{i=1}^{n}g(x_i).$$
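The computation above can be checked numerically; the following sketch assumes, purely for illustration, that $X$ is standard normal and $g(x)=x^{2}$, so that the true value is $\mathbb{E}\left[g(X)\right]=1$.

```python
import numpy as np

# Plug-in estimate of E[g(X)]: integrating g against the empirical CDF
# reduces to the sample average of g evaluated at the draws.
def plug_in_expectation(g, sample):
    sample = np.asarray(sample, dtype=float)
    return np.mean(g(sample))            # (1/n) * sum_i g(x_i)

rng = np.random.default_rng(42)
draws = rng.normal(size=10_000)          # illustrative assumption: X ~ N(0, 1)
g = lambda x: x**2                       # illustrative assumption: g(x) = x^2
print(plug_in_expectation(g, draws))     # approximately E[X^2] = 1
```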
The next example shows how the plug-in principle can be used to approximate quantiles.
Example
Suppose that we need to compute the $p$-quantile $Q_p$ of a random variable $X$ having distribution function $F$, and suppose that we are not able to compute it by using the definition of $p$-quantile:
$$Q_p=\inf\left\{x\in\mathbb{R}:F(x)\geq p\right\}.$$
We can define a mapping $T$ such that, for any distribution function $G\in\mathcal{F}$, we have
$$T(G)=\inf\left\{x\in\mathbb{R}:G(x)\geq p\right\}.$$
Thus, the quantile we need to compute is
$$Q_p=T(F).$$
If we have a sample of $n$ draws $x_1,\ldots,x_n$ from the distribution $F$, and we denote by $F_n$ their empirical distribution function, then the plug-in estimate of $Q_p$ is
$$T(F_n)=x_{(\lceil np\rceil)},$$
where $\lceil np\rceil$ is the ceiling of $np$, that is, the smallest integer not less than $np$, and $x_{(\lceil np\rceil)}$ is the $\lceil np\rceil$-th order statistic of the sample, that is, the $\lceil np\rceil$-th smallest observation in the sample.
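Again, the formula can be checked with a short simulation; the standard normal distribution and the level $p=0.975$ below are illustrative assumptions (the true quantile is approximately $1.96$).

```python
import math
import numpy as np

def plug_in_quantile(sample, p):
    """Plug-in estimate of the p-quantile: the ceil(n*p)-th order statistic."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    k = math.ceil(n * p)                 # smallest integer not less than n*p
    return x[k - 1]                      # k-th smallest observation (1-based)

rng = np.random.default_rng(7)
draws = rng.normal(size=10_000)          # illustrative assumption: X ~ N(0, 1)
print(plug_in_quantile(draws, 0.975))    # approximately 1.96
```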
We will not go into the details of the asymptotic properties of plug-in estimators, because doing so would require a level of mathematical sophistication well beyond that assumed in the rest of these lecture notes. However, we will discuss the main issues related to their convergence and provide some intuition.
First of all, one may wonder whether the plug-in estimate $T(F_n)$ converges in some sense to $T(F)$ as the sample size $n$ increases. As we have seen in the lecture entitled Empirical distribution, the Glivenko-Cantelli theorem and its generalizations provide sets of conditions under which the empirical distribution $F_n$ converges to $F$.
If $F_n$ and $F$ were finite-dimensional vectors, then one could apply the Continuous Mapping theorem and say that if $T$ is continuous, then $T(F_n)$ converges to $T(F)$. Unfortunately, $F_n$ and $F$ are not finite-dimensional, because they are functions defined on $\mathbb{R}$ (which is uncountable), and, as a consequence, it is not possible to apply the Continuous Mapping theorem. However, there are several theorems, analogous to the Continuous Mapping theorem, that can be applied in the case of plug-in estimators: if the mapping $T$ is continuous in some sense (or differentiable), then $T(F_n)$ converges to $T(F)$.
The continuity conditions required in these theorems are often complicated and
difficult to check in practical cases. We refer the interested reader to van
der Vaart (2000) for more details. Rest assured, however, that the most
commonly used mappings $T$ (e.g., mean, variance, moments and cross-moments, quantiles) satisfy the
required continuity conditions.
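As a rough numerical illustration of this convergence (not a proof), the following sketch tracks the plug-in estimate of the median of an exponential distribution as the sample size grows; the exponential distribution and the seed are arbitrary choices.

```python
import numpy as np

# The plug-in estimate of the median should approach the true median ln(2)
# of the exponential(1) distribution as the sample size n increases.
rng = np.random.default_rng(123)
true_median = np.log(2.0)
for n in (100, 1_000, 10_000, 100_000):
    draws = rng.exponential(scale=1.0, size=n)
    estimate = np.sort(draws)[int(np.ceil(n * 0.5)) - 1]  # ceil(n*p)-th order statistic, p = 1/2
    print(n, abs(estimate - true_median))
```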
Furthermore, it is also possible to prove that, under certain conditions, a version of the Central Limit Theorem applies to the plug-in estimate $T(F_n)$, that is, the quantity
$$\sqrt{n}\left(T(F_n)-T(F)\right)$$
converges in distribution to a normal random variable. If $F_n$ and $F$ were finite-dimensional vectors, then one could require that $T$ be differentiable and apply the Delta Method to prove the asymptotic normality of the above quantity. But since $F_n$ and $F$ are infinite-dimensional, a more general technique, called the Functional Delta Method, needs to be employed (which utilizes a notion of differentiability for $T$ that is called Hadamard differentiability). Again, we refer the interested reader to van der Vaart (2000) for more details.
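The asymptotic normality can also be illustrated with a simple Monte Carlo experiment; the mean functional, the uniform distribution, and the sample sizes below are illustrative choices. For $T(F)=\mathbb{E}[X]$, the scaled error $\sqrt{n}\left(T(F_n)-T(F)\right)$ should be approximately normal with variance $\mathrm{Var}[X]$.

```python
import numpy as np

# For the mean functional, sqrt(n) * (sample mean - true mean) should be
# approximately N(0, Var[X]); with X ~ Uniform(0, 1), Var[X] = 1/12.
rng = np.random.default_rng(2021)
n, replications = 500, 5_000
scaled_errors = np.empty(replications)
for r in range(replications):
    draws = rng.uniform(0.0, 1.0, size=n)
    scaled_errors[r] = np.sqrt(n) * (draws.mean() - 0.5)
print(scaled_errors.std())               # close to sqrt(1/12), about 0.2887
```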
References
van der Vaart, A. W. (2000) Asymptotic statistics, Cambridge University Press.