The Lebesgue integral is used to give a completely general definition of expected value. This lecture introduces the Lebesgue integral, first in an intuitive manner and then in a more rigorous manner.
Let us recall the informal definition of expected value we have given in the lecture entitled Expected Value:
Definition
The expected value of a random variable $X$ is the weighted average of the values that $X$ can take on, where each possible value is weighted by its respective probability.
When $X$ is discrete and can take on only finitely many values, it is straightforward to compute the expected value of $X$, by just applying the above definition. Denote by $x_1$, ..., $x_n$ the $n$ values that $X$ can take on (the $n$ elements of its support) and define the following events:
$$E_i = \{\omega \in \Omega : X(\omega) = x_i\}, \quad i = 1, \ldots, n$$
i.e. when the event $E_i$ happens, then $X$ equals $x_i$.
We can write the expected value of $X$ as
$$\mathrm{E}[X] = \sum_{i=1}^{n} x_i \, \mathrm{P}(E_i)$$
i.e. the expected value of $X$ is the weighted average of the values that $X$ can take on ($x_1$, ..., $x_n$), where each possible value $x_i$ is weighted by its respective probability $\mathrm{P}(E_i)$.
Note that this is a way of expressing the expected value that uses neither $F_X(x)$, the distribution function of $X$, nor its probability mass function $p_X(x)$. Instead, the above way of expressing the expected value uses only the probability $\mathrm{P}$ defined on the events $E_i$.
In many applications, it turns out that this is a very convenient way of expressing (and calculating) the expected value: for example, when the distribution function of $X$ is not directly known and is difficult to derive, it is sometimes easier to directly compute the probabilities $\mathrm{P}(E_i)$ defined on the events $E_i$.
Below, this will be illustrated with an example.
When $X$ is discrete, but can take on infinitely many values, in a similar fashion we can write
$$\mathrm{E}[X] = \sum_{i=1}^{\infty} x_i \, \mathrm{P}(E_i)$$
In this case, however, there is a possibility that $\mathrm{E}[X]$ is not well-defined: this happens when the infinite series above does not converge, that is, when the limit
$$\lim_{n \to \infty} \sum_{i=1}^{n} x_i \, \mathrm{P}(E_i)$$
does not exist. In the next section we will show how to take care of this possibility.
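When the series does converge, its partial sums settle down to the expected value. A small hedged sketch (the geometric distribution here is an assumed example, not taken from the lecture): for a geometric random variable with parameter $p = 0.5$ and support $\{1, 2, 3, \ldots\}$, the series converges to $\mathrm{E}[X] = 1/p = 2$.

```python
# Partial sums sum_{i=1}^{n} x_i * P(E_i) for a geometric random variable
# with parameter p = 0.5 on the support {1, 2, 3, ...}.
# The series converges, and its limit is E[X] = 1/p = 2.
p = 0.5
partial_sum = 0.0
for i in range(1, 200):
    partial_sum += i * (1 - p) ** (i - 1) * p  # x_i * P(X = x_i)

print(partial_sum)  # close to 2.0
```

By contrast, for a distribution such as the one in the St. Petersburg paradox the analogous partial sums grow without bound, and the expected value is not finite.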
In the case in which $X$ is not discrete (its support has the power of the continuum), things are much more complicated. In this case, the above summation does not make any sense (the support of $X$ cannot be arranged into a sequence and so there is no sequence over which we can sum). Thus, we have to find a workaround. The workaround is similar to the one we have discussed in the presentation of the Stieltjes integral: we build a simpler random variable $Y$ that is a good approximation of $X$ and whose expected value can easily be computed; then we make the approximation better and better; finally, we define the expected value of $X$ to be equal to the expected value of $Y$ when the approximation tends to become perfect.
How does the approximation work, intuitively? We illustrate it in three steps: in the first step, we partition the sample space $\Omega$ into $n$ events $E_1$, ..., $E_n$, such that
$$E_i \cap E_j = \emptyset \text{ for } i \neq j \quad \text{and} \quad \bigcup_{i=1}^{n} E_i = \Omega;$$
in the second step, we find, for each event $E_i$, the smallest value that $X$ can take on when the event $E_i$ happens:
$$y_i = \min_{\omega \in E_i} X(\omega);$$
in the third step, we define the random variable $Y$ (which approximates $X$) as follows:
$$Y(\omega) = y_i \text{ if } \omega \in E_i$$
In this way, we have built a random variable $Y$ such that
$$Y(\omega) \leq X(\omega)$$
for any $\omega \in \Omega$. The finer the partition $E_1$, ..., $E_n$ is, the better the approximation is: intuitively, when the sets $E_i$ become smaller, then $Y$ becomes closer to the values that $X$ takes on when $E_i$ happens.
The expected value of $Y$ is, of course, easy to compute:
$$\mathrm{E}[Y] = \sum_{i=1}^{n} y_i \, \mathrm{P}(E_i)$$
The expected value of $X$ is defined as follows:
$$\mathrm{E}[X] = \lim_{Y \to X} \mathrm{E}[Y]$$
where the notation $Y \to X$ means that $Y$ becomes a better approximation of $X$ (because the partition $E_1$, ..., $E_n$ is made finer).
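The three-step construction can be tried out numerically. In the following hedged sketch (a hypothetical setting chosen for illustration), $\Omega = [0, 1)$ with the uniform measure, $X(\omega) = \omega$, and the partition consists of $n$ equal subintervals, so the true expected value is $1/2$:

```python
# Approximating E[X] by simple random variables, following the construction
# in the text. Hypothetical setting: Omega = [0, 1), P = uniform measure,
# X(omega) = omega, so the true expected value is 1/2.

def expected_value_of_approximation(n):
    """Partition [0, 1) into n equal events E_i, take the smallest value
    of X on each E_i (its left endpoint), and return E[Y]."""
    total = 0.0
    for i in range(n):
        y_i = i / n   # min of X on E_i = [i/n, (i+1)/n)
        p_i = 1 / n   # P(E_i)
        total += y_i * p_i
    return total

for n in [2, 10, 100, 1000]:
    print(n, expected_value_of_approximation(n))
# E[Y] increases toward E[X] = 0.5 as the partition is made finer.
```

Each refinement of the partition raises $\mathrm{E}[Y]$ toward $\mathrm{E}[X]$, since $Y \leq X$ by construction.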
Several equivalent integral notations are used to denote the above limit:
$$\lim_{Y \to X} \mathrm{E}[Y] = \int_{\Omega} X \, d\mathrm{P} = \int_{\Omega} X(\omega) \, d\mathrm{P}(\omega)$$
and the integral is called the Lebesgue integral of $X$ with respect to the probability measure $\mathrm{P}$. The notation $d\mathrm{P}$ (or $d\mathrm{P}(\omega)$) indicates that the sets $E_i$ become very small by improving the approximation (making the partition $E_1$, ..., $E_n$ finer); the integral sign $\int_{\Omega}$ can be thought of as a shorthand for the summation sign $\sum_{i=1}^{n}$; $X(\omega)$ appears in place of $y_i$ in the integral, because the two tend to coincide when the approximation becomes better and better.
An important property enjoyed by the Lebesgue integral is linearity.
Proposition
Let $X$ and $Y$ be two random variables and let $a$ and $b$ be two constants. Then,
$$\int_{\Omega} (aX + bY) \, d\mathrm{P} = a \int_{\Omega} X \, d\mathrm{P} + b \int_{\Omega} Y \, d\mathrm{P}$$
The next example shows an important application of the linearity of the Lebesgue integral. The example also shows how the Lebesgue integral can, in certain situations, be much simpler to use than the Stieltjes integral when computing the expected value of a random variable.
Example
Let $X$ and $Y$ be two random variables. We want to define (and compute) the expected value of the sum $X + Y$. Define a new random variable $Z$:
$$Z = X + Y$$
Using the Stieltjes integral, the expected value is defined as follows:
$$\mathrm{E}[Z] = \int_{-\infty}^{\infty} z \, dF_Z(z)$$
where $F_Z(z)$ is the distribution function of $Z$. Hence, to compute the above integral, we first need to know the distribution function of $Z$ (which might be extremely difficult to derive). By using the Lebesgue integral, the expected value is defined as follows:
$$\mathrm{E}[Z] = \int_{\Omega} Z \, d\mathrm{P}$$
However, by linearity of the Lebesgue integral, we obtain
$$\mathrm{E}[Z] = \int_{\Omega} (X + Y) \, d\mathrm{P} = \int_{\Omega} X \, d\mathrm{P} + \int_{\Omega} Y \, d\mathrm{P} = \mathrm{E}[X] + \mathrm{E}[Y]$$
Thus, to compute the expected value of $Z$, we do not need to know the distribution function of $Z$, but we only need to know the expected values of $X$ and $Y$.
The example thus shows that linearity of the Lebesgue integral trivially translates into linearity of the expected value.
Proposition
Let $X$ and $Y$ be two random variables and let $a$ and $b$ be two constants. Then,
$$\mathrm{E}[aX + bY] = a\,\mathrm{E}[X] + b\,\mathrm{E}[Y]$$
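This linearity property can be checked directly on a small example. The setup below is a hypothetical one (a die roll with $Y = X^2$, chosen so that $X$ and $Y$ are dependent and the distribution of $aX + bY$ would be tedious to derive):

```python
# Numerical check of linearity of the expected value: E[aX + bY] = a E[X] + b E[Y].
# Hypothetical example: X is a fair die roll and Y = X**2, so X and Y are
# dependent, yet linearity still holds.
faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6
a, b = 2.0, 3.0

e_x = sum(x * p for x in faces)                    # E[X] = 3.5
e_y = sum(x**2 * p for x in faces)                 # E[Y] = 91/6
lhs = sum((a * x + b * x**2) * p for x in faces)   # E[aX + bY], computed directly
rhs = a * e_x + b * e_y

print(lhs, rhs)  # both equal 52.5 (up to floating-point rounding)
```

The point illustrated is that no joint or sum distribution is needed: the two expected values suffice.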
A more rigorous definition of the Lebesgue integral requires that we introduce the notion of a simple random variable. A random variable $S$ is called simple if it takes on finitely many positive values, that is, there exist $n$ events $A_1$, ..., $A_n$ such that
$$A_i \cap A_j = \emptyset \text{ for } i \neq j \quad \text{and} \quad \bigcup_{i=1}^{n} A_i = \Omega$$
and furthermore
$$S(\omega) = s_i \geq 0$$
for all $\omega \in A_i$.
Note that a simple random variable is also a discrete random variable. Hence, the expected value of a simple random variable is easy to compute (it is just the weighted sum of the elements of its support).
The Lebesgue integral of a simple random variable $S$ is defined to be equal to its expected value:
$$\int_{\Omega} S \, d\mathrm{P} = \mathrm{E}[S] = \sum_{i=1}^{n} s_i \, \mathrm{P}(A_i)$$
Let $X$ be the random variable whose integral we want to compute. Let $X^{+}$ and $X^{-}$ be the positive and negative part of $X$ respectively:
$$X^{+}(\omega) = \max(X(\omega), 0), \qquad X^{-}(\omega) = \max(-X(\omega), 0)$$
Note that $X^{+}(\omega) \geq 0$ and $X^{-}(\omega) \geq 0$ for any $\omega \in \Omega$ and
$$X(\omega) = X^{+}(\omega) - X^{-}(\omega)$$
The Lebesgue integral of $X^{+}$ is defined as follows:
$$\int_{\Omega} X^{+} \, d\mathrm{P} = \sup \left\{ \int_{\Omega} S \, d\mathrm{P} : S \text{ simple}, \ S(\omega) \leq X^{+}(\omega) \text{ for all } \omega \in \Omega \right\}$$
In words, the Lebesgue integral of $X^{+}$ is obtained by taking the supremum over the Lebesgue integrals of all the simple random variables $S$ that are less than or equal to $X^{+}$.
The Lebesgue integral of $X^{-}$ is defined as follows:
$$\int_{\Omega} X^{-} \, d\mathrm{P} = \sup \left\{ \int_{\Omega} S \, d\mathrm{P} : S \text{ simple}, \ S(\omega) \leq X^{-}(\omega) \text{ for all } \omega \in \Omega \right\}$$
Finally, the Lebesgue integral of $X$ is defined as the difference between the integrals of its positive and negative parts:
$$\int_{\Omega} X \, d\mathrm{P} = \int_{\Omega} X^{+} \, d\mathrm{P} - \int_{\Omega} X^{-} \, d\mathrm{P}$$
provided the difference makes sense; in case $\int_{\Omega} X^{+} \, d\mathrm{P}$ and $\int_{\Omega} X^{-} \, d\mathrm{P}$ are both equal to infinity, the difference is not well-defined and we say that $X$ is not integrable.
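The positive/negative-part decomposition is easy to carry out concretely. In this hypothetical discrete sketch (a small gamble with both gains and losses, not taken from the lecture), the decomposition holds pointwise and the expected value equals the difference of the two integrals:

```python
# Decomposing a random variable into its positive and negative parts,
# X = X+ - X-, and computing E[X] = E[X+] - E[X-].
# Hypothetical example: a discrete gamble with gains and losses.
values = [-3.0, -1.0, 2.0, 5.0]
probs = [0.1, 0.4, 0.3, 0.2]

pos_part = [max(x, 0.0) for x in values]    # X+(omega) = max(X(omega), 0)
neg_part = [max(-x, 0.0) for x in values]   # X-(omega) = max(-X(omega), 0)

# The decomposition holds pointwise: X = X+ - X-
assert all(x == p - m for x, p, m in zip(values, pos_part, neg_part))

e_pos = sum(x * p for x, p in zip(pos_part, probs))  # E[X+] = 1.6
e_neg = sum(x * p for x, p in zip(neg_part, probs))  # E[X-] = 0.7
print(round(e_pos - e_neg, 10))  # 0.9, equal to E[X]
```

Since both $\mathrm{E}[X^{+}]$ and $\mathrm{E}[X^{-}]$ are finite here, the difference is well-defined and $X$ is integrable.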
Please cite as:
Taboga, Marco (2021). "Expected value and the Lebesgue integral", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-probability/expected-value-and-Lebesgue-integral.