The concept of expected value of a random variable is one of the most important concepts in probability theory.
The concept was first devised in the 17th century to analyze gambling games and answer questions such as:
how much do I gain - or lose - on average, if I repeatedly play a given gambling game?
how much can I expect to gain - or lose - by making a certain bet?
If the possible outcomes of the game (or the bet) and their associated probabilities are described by a random variable, then these questions can be answered by computing its expected value.
The expected value is a weighted average of the possible realizations of the random variable (the possible outcomes of the game). Each realization is weighted by its probability.
For example, if you play a game where you gain $2 with probability 1/2 and you lose $1 with probability 1/2, then the expected value of the game is half a dollar:
$$2\cdot\frac{1}{2}+(-1)\cdot\frac{1}{2}=\frac{1}{2}.$$
What does this mean? Roughly speaking, it means that if you play this game many times, and the number of times each of the two possible outcomes occurs is proportional to its probability, then on average you gain $1/2 each time you play the game.
For instance, if you play the game 100 times, win 50 times and lose the remaining 50, then your average winning is equal to the expected value:
$$\frac{50\cdot 2+50\cdot(-1)}{100}=\frac{1}{2}.$$
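This back-of-the-envelope calculation is easy to check with a short simulation (a sketch in Python; the stakes and probabilities are those of the game above):

```python
import random

random.seed(0)  # fix the seed so the run is reproducible

# Play the game many times: gain $2 with probability 1/2, lose $1 otherwise.
n_plays = 100_000
total_gain = sum(2 if random.random() < 0.5 else -1 for _ in range(n_plays))
average_gain = total_gain / n_plays

# The theoretical expected value is 2 * (1/2) + (-1) * (1/2) = 0.5 dollars.
print(average_gain)  # close to 0.5
```

By the law of large numbers, the average gain gets closer and closer to the expected value as the number of plays grows.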
In general, giving a rigorous definition of expected value requires quite a heavy mathematical apparatus. To keep things simple, we provide an informal definition of expected value and we discuss its computation in this lecture, while we relegate a more rigorous definition to the (optional) lecture entitled Expected value and the Lebesgue integral.
The following is an informal definition of expected value.
Definition (informal)
The expected value of a random variable $X$ is the weighted average of the values that $X$ can take on, where each possible value is weighted by its respective probability.
The expected value of a random variable $X$ is denoted by $\mathrm{E}[X]$ and it is often called the expectation of $X$ or the mean of $X$.
The following sections discuss how the expected value of a random variable is computed.
When $X$ is a discrete random variable having support $R_X$ and probability mass function $p_X(x)$, the formula for computing its expected value is a straightforward implementation of the informal definition given above: the expected value of $X$ is the weighted average of the values that $X$ can take on (the elements of $R_X$), where each possible value $x$ is weighted by its respective probability $p_X(x)$.
Definition
Let $X$ be a discrete random variable with support $R_X$ and probability mass function $p_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x)$$
provided that
$$\sum_{x\in R_X}\left|x\right|p_X(x)<\infty.$$
The symbol $\sum_{x\in R_X}$ indicates summation over all the elements of the support $R_X$.
For example, if the support is $R_X=\{x_1,x_2,x_3\}$, then
$$\sum_{x\in R_X}x\,p_X(x)=x_1\,p_X(x_1)+x_2\,p_X(x_2)+x_3\,p_X(x_3).$$
The requirement that
$$\sum_{x\in R_X}\left|x\right|p_X(x)<\infty$$
is called absolute summability and ensures that the summation
$$\sum_{x\in R_X}x\,p_X(x)$$
is well-defined also when the support $R_X$ contains infinitely many elements.
When summing infinitely many terms, the order in which you sum them can change the result of the sum. However, if the terms are absolutely summable, then the order in which you sum becomes irrelevant.
In the above definition of expected value, the order of the sum
$$\sum_{x\in R_X}x\,p_X(x)$$
is not specified; therefore, the requirement of absolute summability is introduced in order to ensure that the expected value is well-defined.
When the absolute summability condition is not satisfied, we say that the expected value of $X$ is not well-defined or that it does not exist.
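The role of absolute summability can be illustrated numerically. Consider a hypothetical variable (made up for this sketch, not taken from the text) that takes the value $(-1)^{n+1}2^n/n$ with probability $2^{-n}$: the signed series of weighted values converges, but the series of absolute values diverges, so the expected value does not exist:

```python
# Hypothetical variable: value x_n = (-1)**(n + 1) * 2**n / n with
# probability p_n = 2**(-n), for n = 1, 2, 3, ...
# Then x_n * p_n = (-1)**(n + 1) / n and |x_n| * p_n = 1 / n.
n_terms = 100_000
signed_sum = sum((-1) ** (n + 1) / n for n in range(1, n_terms + 1))
absolute_sum = sum(1 / n for n in range(1, n_terms + 1))

print(signed_sum)    # converges to log(2) ≈ 0.6931
print(absolute_sum)  # ≈ 12.09 here; grows without bound as n_terms increases
```

Because the absolute series diverges, rearranging the order of summation could make the signed sum converge to any value whatsoever, which is precisely why absolute summability is required.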
Example
Let $X$ be a random variable with support $R_X$ and probability mass function $p_X(x)$. Its expected value is obtained by applying the definition above, that is, by computing the weighted sum $\sum_{x\in R_X}x\,p_X(x)$.
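In practice, the weighted sum in the definition is a one-liner once the probability mass function is written down (a sketch with a hypothetical pmf, not the one from the example above):

```python
# Hypothetical pmf: each possible value of X mapped to its probability.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
assert abs(sum(pmf.values()) - 1.0) < 1e-12  # probabilities must sum to 1

# Expected value: weighted average of the support, weights = probabilities.
expected_value = sum(x * p for x, p in pmf.items())
print(expected_value)  # 1*0.2 + 2*0.5 + 3*0.3 = 2.1 (up to rounding)
```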
When $X$ is a continuous random variable with probability density function $f_X(x)$, the formula for computing its expected value involves an integral, which can be thought of as the limiting case of the summation $\sum_{x\in R_X}x\,p_X(x)$ found in the discrete case above.
Definition
Let $X$ be a continuous random variable with probability density function $f_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx$$
provided that
$$\int_{-\infty}^{+\infty}\left|x\right|f_X(x)\,dx<\infty.$$
Roughly speaking, this integral is the limiting case of the formula for the expected value of a discrete random variable
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x).$$
Here, $p_X(x)$ is replaced by $f_X(x)\,dx$ (the infinitesimal probability of $x$) and the integral sign $\int_{-\infty}^{+\infty}$ replaces the summation sign $\sum_{x\in R_X}$.
The requirement that
$$\int_{-\infty}^{+\infty}\left|x\right|f_X(x)\,dx<\infty$$
is called absolute integrability and ensures that the improper integral
$$\int_{-\infty}^{+\infty}x\,f_X(x)\,dx$$
is well-defined.
This improper integral is a shorthand for
$$\lim_{s\to-\infty}\int_{s}^{0}x\,f_X(x)\,dx+\lim_{t\to+\infty}\int_{0}^{t}x\,f_X(x)\,dx$$
and it is well-defined only if both limits are finite. Absolute integrability guarantees that the latter condition is met and that the expected value is well-defined.
When the absolute integrability condition is not satisfied, we say that the expected value of $X$ is not well-defined or that it does not exist.
Example
Let $X$ be a continuous random variable with support $R_X$ and probability density function $f_X(x)$. Its expected value is obtained by applying the definition above, that is, by computing the integral $\int_{-\infty}^{+\infty}x\,f_X(x)\,dx$.
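When the integral cannot be solved in closed form, it can be approximated numerically, for instance with a midpoint Riemann sum. The following sketch uses a hypothetical uniform density on $[0,2]$ (whose exact mean is 1):

```python
# Hypothetical density: uniform on [0, 2]; the exact expected value is 1.
a, b = 0.0, 2.0

def f(x):
    return 1.0 / (b - a) if a <= x <= b else 0.0

# Midpoint Riemann sum for E[X] = integral of x * f(x) dx over the support.
n = 100_000
h = (b - a) / n
midpoints = (a + (i + 0.5) * h for i in range(n))
approx = sum(m * f(m) * h for m in midpoints)
print(approx)  # ≈ 1.0
```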
This section introduces a general formula for computing the expected value of a random variable $X$. The formula, which does not require $X$ to be discrete or continuous and is applicable to any random variable, involves an integral called the Riemann-Stieltjes integral. While we briefly discuss this formula for the sake of completeness, no deep understanding of this formula or of the Riemann-Stieltjes integral is required to understand the other lectures.
Definition
Let $X$ be a random variable having distribution function $F_X(x)$. The expected value of $X$ is
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,dF_X(x)$$
where the integral is a Riemann-Stieltjes integral and the expected value exists and is well-defined only as long as the integral is well-defined.
Roughly speaking, this integral is the limiting case of the formula for the expected value of a discrete random variable
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x).$$
Here $dF_X(x)$ replaces $p_X(x)$ (the probability of $x$) and the integral sign $\int_{-\infty}^{+\infty}$ replaces the summation sign $\sum_{x\in R_X}$.
The following section contains a brief and informal introduction to the Riemann-Stieltjes integral and an explanation of the above formula. Less technically oriented readers can safely skip it: when they encounter a Riemann-Stieltjes integral, they can just think of it as a formal notation which allows for a unified treatment of discrete and continuous random variables and can be treated as a sum in one case and as an ordinary Riemann integral in the other.
As we have already seen above, the expected value of a discrete random variable is straightforward to compute: the expected value of a discrete variable $X$ is the weighted average of the values that $X$ can take on (the elements of the support $R_X$), where each possible value $x$ is weighted by its respective probability $p_X(x)$:
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x)$$
or, written in a slightly different fashion,
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,\mathrm{P}(X=x).$$
When $X$ is not discrete the above summation does not make any sense. However, there is a workaround that allows us to extend the formula to random variables that are not discrete. The workaround entails approximating $X$ with discrete variables that can take on only finitely many values.
Let $l_1$, $l_2$, ..., $l_n$ be $n$ real numbers ($n\in\mathbb{N}$) such that
$$l_1<l_2<\ldots<l_n.$$
Define a new random variable $X_n$ (a function of $X$) as follows:
$$X_n=l_j\quad\text{if }l_j\leq X<l_{j+1},\quad j=1,\ldots,n-1.$$
As the number $n$ of points increases and the points become closer and closer (the maximum distance between two successive points tends to zero), $X_n$ becomes a very good approximation of $X$, until, in the limit, it is indistinguishable from $X$.
The expected value of $X_n$ is easy to compute:
$$\mathrm{E}[X_n]=\sum_{j=1}^{n-1}l_j\,\mathrm{P}(l_j\leq X<l_{j+1})=\sum_{j=1}^{n-1}l_j\left[F_X(l_{j+1})-F_X(l_j)\right]$$
where $F_X(x)$ is the distribution function of $X$.
The expected value of $X$ is then defined as the limit of $\mathrm{E}[X_n]$ when $n$ tends to infinity (i.e., when the approximation becomes better and better):
$$\mathrm{E}[X]=\lim_{n\to\infty}\mathrm{E}[X_n].$$
When the latter limit exists and is well-defined, it is called the Riemann-Stieltjes integral of $x$ with respect to $F_X(x)$ and it is indicated as follows:
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,dF_X(x).$$
Roughly speaking, the integral notation $\int_{-\infty}^{+\infty}x\,dF_X(x)$ can be thought of as a shorthand for $\lim_{n\to\infty}\sum_{j=1}^{n-1}l_j\left[F_X(l_{j+1})-F_X(l_j)\right]$ and the differential notation $dF_X(x)$ can be thought of as a shorthand for $F_X(l_{j+1})-F_X(l_j)$.
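The limiting sum above is easy to evaluate on a fine grid. The following sketch approximates $\int x\,dF_X(x)$ for a hypothetical exponential distribution with rate 1 (not an example from the text; its exact mean is 1):

```python
import math

# Distribution function of an exponential distribution with rate 1.
def F(x):
    return 1.0 - math.exp(-x) if x > 0 else 0.0

# Approximate E[X] by the Riemann-Stieltjes sum
# sum_j l_j * (F(l_{j+1}) - F(l_j)) on a fine grid l_1 < ... < l_n.
grid = [i * 0.001 for i in range(20_001)]  # points from 0 to 20
approx = sum(
    grid[j] * (F(grid[j + 1]) - F(grid[j])) for j in range(len(grid) - 1)
)
print(approx)  # ≈ 1 (the grid truncates the tail and slightly underestimates)
```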
If you are not familiar with the Riemann-Stieltjes integral, make sure you also read the lecture entitled Computing the Riemann-Stieltjes integral: some rules, before reading the next example.
Example
Let $X$ be a random variable with support $R_X$ and distribution function $F_X(x)$. Its expected value is obtained by computing the Riemann-Stieltjes integral $\int_{-\infty}^{+\infty}x\,dF_X(x)$ with the rules mentioned above.
A completely general and rigorous definition of expected value is based on the Lebesgue integral. We report it below without further comments. Less technically inclined readers can safely skip it, while interested readers can read more about it in the lecture entitled Expected value and the Lebesgue integral.
Definition
Let $\Omega$ be a sample space, $\mathrm{P}$ a probability measure defined on the events of $\Omega$ and $X$ a random variable defined on $\Omega$. The expected value of $X$ is
$$\mathrm{E}[X]=\int_{\Omega}X\,d\mathrm{P}$$
provided
$$\int_{\Omega}X\,d\mathrm{P}$$
(the Lebesgue integral of $X$ with respect to $\mathrm{P}$) exists and is well-defined.
The next sections contain more details about the expected value.
An important property of the expected value, known as transformation theorem, allows us to easily compute the expected value of a function of a random variable.
Let $X$ be a random variable. Let $g:\mathbb{R}\rightarrow\mathbb{R}$ be a real function. Define a new random variable $Y$ as follows:
$$Y=g(X).$$
Then,
$$\mathrm{E}[Y]=\mathrm{E}[g(X)]=\int_{-\infty}^{+\infty}g(x)\,dF_X(x)$$
provided the above integral exists.
This is an important property. It says that, if you need to compute the expected value of $Y=g(X)$, you do not need to know the support of $Y$ and its distribution function $F_Y(y)$: you can compute it just by replacing $x$ with $g(x)$ in the formula for the expected value of $X$.
For discrete random variables the formula becomes
$$\mathrm{E}[g(X)]=\sum_{x\in R_X}g(x)\,p_X(x)$$
while for continuous random variables it is
$$\mathrm{E}[g(X)]=\int_{-\infty}^{+\infty}g(x)\,f_X(x)\,dx.$$
It is possible (albeit non-trivial) to prove that the above two formulae hold also when $X$ is a $K$-dimensional random vector, $g$ is a real function of $K$ variables and $Y=g(X)$.
When $X$ is a discrete random vector and $p_X(x)$ is its joint probability mass function, then
$$\mathrm{E}[g(X)]=\sum_{x\in R_X}g(x)\,p_X(x).$$
When $X$ is a continuous random vector and $f_X(x)$ is its joint density function, then
$$\mathrm{E}[g(X)]=\int_{\mathbb{R}^K}g(x)\,f_X(x)\,dx.$$
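The transformation theorem is easy to verify numerically for a discrete variable: computing $\sum_x g(x)\,p_X(x)$ directly gives the same result as first deriving the distribution of $Y=g(X)$. A sketch with a hypothetical pmf and $g(x)=x^2$:

```python
# Hypothetical pmf of X and transformation g(x) = x**2.
pmf_x = {-1: 0.25, 0: 0.5, 1: 0.25}

def g(x):
    return x ** 2

# Transformation theorem: sum g(x) * p(x) over the support of X,
# without ever deriving the distribution of Y = g(X).
e_g = sum(g(x) * p for x, p in pmf_x.items())

# Cross-check the long way: derive the pmf of Y = g(X), then compute E[Y].
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[g(x)] = pmf_y.get(g(x), 0.0) + p
e_y = sum(y * p for y, p in pmf_y.items())

print(e_g, e_y)  # both equal 0.5
```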
If $X$ is a random variable and $Y$ is another random variable such that
$$Y=a+bX$$
where $a$ and $b$ are two constants, then the following holds:
$$\mathrm{E}[Y]=a+b\,\mathrm{E}[X].$$
For discrete random variables this is proved as follows:
$$\mathrm{E}[Y]=\sum_{x\in R_X}\left(a+bx\right)p_X(x)=a\sum_{x\in R_X}p_X(x)+b\sum_{x\in R_X}x\,p_X(x)=a+b\,\mathrm{E}[X]$$
where we have used the fact that $\sum_{x\in R_X}p_X(x)=1$. For continuous random variables the proof is
$$\mathrm{E}[Y]=\int_{-\infty}^{+\infty}\left(a+bx\right)f_X(x)\,dx=a\int_{-\infty}^{+\infty}f_X(x)\,dx+b\int_{-\infty}^{+\infty}x\,f_X(x)\,dx=a+b\,\mathrm{E}[X]$$
where we have used the fact that $\int_{-\infty}^{+\infty}f_X(x)\,dx=1$.
In general, the linearity property is a consequence of the transformation theorem and of the fact that the Riemann-Stieltjes integral is a linear operator:
$$\mathrm{E}[a+bX]=\int_{-\infty}^{+\infty}\left(a+bx\right)dF_X(x)=a\int_{-\infty}^{+\infty}dF_X(x)+b\int_{-\infty}^{+\infty}x\,dF_X(x)=a+b\,\mathrm{E}[X].$$
A stronger linearity property holds, which involves two (or more) random variables. The property can be proved only using the Lebesgue integral (see the lecture entitled Expected value and the Lebesgue integral).
The property is as follows: let $X$ and $Y$ be two random variables and let $a$ and $b$ be two constants; then
$$\mathrm{E}[aX+bY]=a\,\mathrm{E}[X]+b\,\mathrm{E}[Y].$$
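This stronger property holds whether or not $X$ and $Y$ are independent, which can be checked on a joint pmf (the joint distribution and constants below are made up for illustration):

```python
# Hypothetical joint pmf: (x, y) pairs mapped to P(X = x, Y = y).
joint = {(0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.5}
a, b = 2.0, -3.0

e_x = sum(x * p for (x, y), p in joint.items())  # E[X]
e_y = sum(y * p for (x, y), p in joint.items())  # E[Y]
e_lin = sum((a * x + b * y) * p for (x, y), p in joint.items())  # E[aX + bY]

print(e_lin, a * e_x + b * e_y)  # the two numbers coincide
```

Note that the variables in this joint pmf are not independent, yet linearity still holds.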
Let $X$ be a $K$-dimensional random vector and denote its components by $X_1$, ..., $X_K$. The expected value of $X$, denoted by $\mathrm{E}[X]$, is just the vector of the expected values of the $K$ components of $X$. Suppose, for example, that $X$ is a row vector; then
$$\mathrm{E}[X]=\begin{bmatrix}\mathrm{E}[X_1]&\mathrm{E}[X_2]&\ldots&\mathrm{E}[X_K]\end{bmatrix}.$$
Let $X$ be a $K\times L$ random matrix, that is, a $K\times L$ matrix whose entries are random variables. Denote its $(i,j)$-th entry by $X_{ij}$. The expected value of $X$, denoted by $\mathrm{E}[X]$, is just the matrix of the expected values of the entries of $X$:
$$\mathrm{E}[X]=\begin{bmatrix}\mathrm{E}[X_{11}]&\ldots&\mathrm{E}[X_{1L}]\\\vdots&\ddots&\vdots\\\mathrm{E}[X_{K1}]&\ldots&\mathrm{E}[X_{KL}]\end{bmatrix}.$$
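As a quick numerical illustration of the entry-by-entry definition, one can average many draws of a small random matrix (a sketch using a hypothetical 2x2 matrix of independent uniform(0,1) entries, each with mean 0.5):

```python
import random

random.seed(1)  # reproducible draws

# Average n draws of a 2x2 matrix with independent uniform(0, 1) entries.
n = 50_000
mean = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(n):
    for i in range(2):
        for j in range(2):
            mean[i][j] += random.random() / n

print(mean)  # every entry is close to E[uniform(0, 1)] = 0.5
```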
Denote the absolute value of a random variable $X$ by $\left|X\right|$. If $\mathrm{E}[\left|X\right|]$ exists and is finite, we say that $X$ is an integrable random variable, or just that $X$ is integrable.
Let $(\Omega,F,\mathrm{P})$ be a probability space. The space of all random variables $X$ such that $\mathrm{E}[\left|X\right|]$ exists and is finite is denoted by $L^1$ or $L^1(\Omega,F,\mathrm{P})$, where the triple $(\Omega,F,\mathrm{P})$ makes the dependence on the underlying probability space explicit.
If $X$ belongs to $L^1(\Omega,F,\mathrm{P})$, we write $X\in L^1$.
Hence, if $X$ is integrable, we write $X\in L^1$.
The following lectures contain more material about the expected value.
Conditional expectation
Introduces the conditional version of the expected value operator
Properties of the expected value
Statements, proofs and examples of the main properties of the expected value operator
Expected value and the Lebesgue integral
Provides a rigorous definition of expected value, based on the Lebesgue integral
Some solved exercises on expected value can be found below.
Let $X$ be a discrete random variable with support $R_X$ and probability mass function $p_X(x)$. Compute the expected value of $X$.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x).$$
Let $X$ be a discrete variable with support $R_X$ and probability mass function $p_X(x)$. Compute its expected value.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x).$$
Let $X$ be a discrete variable with support $R_X$ and probability mass function $p_X(x)$. Compute the expected value of $X$.
Since $X$ is discrete, its expected value is computed as a sum over the support of $X$:
$$\mathrm{E}[X]=\sum_{x\in R_X}x\,p_X(x).$$
Let $X$ be a continuous random variable with uniform distribution on an interval $[a,b]$. Its support is
$$R_X=[a,b].$$
Its probability density function is
$$f_X(x)=\begin{cases}\dfrac{1}{b-a}&\text{if }x\in[a,b]\\0&\text{otherwise.}\end{cases}$$
Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral:
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx=\int_{-\infty}^{a}x\cdot 0\,dx+\int_{a}^{b}\frac{x}{b-a}\,dx+\int_{b}^{+\infty}x\cdot 0\,dx=\frac{1}{b-a}\left[\frac{x^2}{2}\right]_{a}^{b}=\frac{b^2-a^2}{2\left(b-a\right)}=\frac{a+b}{2}.$$
Note that the trick is to: 1) subdivide the interval of integration to isolate the sub-intervals where the density is zero; 2) split up the integral among the various sub-intervals.
Let $X$ be a continuous random variable with support $R_X$ and probability density function $f_X(x)$. Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral over the support:
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx.$$
Let $X$ be a continuous random variable with support $R_X$ and probability density function $f_X(x)$. Compute the expected value of $X$.
Since $X$ is continuous, its expected value can be computed as an integral over the support:
$$\mathrm{E}[X]=\int_{-\infty}^{+\infty}x\,f_X(x)\,dx.$$
Please cite as:
Taboga, Marco (2021). "Expected value", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-probability/expected-value.