The multinomial distribution is a multivariate discrete distribution that generalizes the binomial distribution.
If you perform $n$ times a probabilistic experiment that can have only two outcomes, then the number of times you obtain one of the two outcomes is a binomial random variable.

If you perform $n$ times an experiment that can have $K$ outcomes ($K$ can be any natural number) and you denote by $X_i$ the number of times that you obtain the $i$-th outcome, then the random vector $X$ defined as
$$X = \begin{bmatrix} X_1 & X_2 & \ldots & X_K \end{bmatrix}^{\top}$$
is a multinomial random vector.
A multinomial vector can be seen as a sum of mutually independent Multinoulli random vectors.
This connection between the multinomial and Multinoulli distributions will be illustrated in detail in the rest of this lecture and will be used to demonstrate several properties of the multinomial distribution.
For this reason, we highly recommend studying the Multinoulli distribution before reading the following sections.
Multinomial random vectors are characterized as follows.
Definition
Let $X$ be a $K \times 1$ discrete random vector. Let $n \in \mathbb{N}$. Let the support of $X$ be the set of $K \times 1$ vectors having non-negative integer entries summing up to $n$:
$$R_X = \left\{ x \in \{0, 1, \ldots, n\}^K : \sum_{i=1}^{K} x_i = n \right\}$$
Let $p_1$, ..., $p_K$ be $K$ strictly positive numbers such that
$$\sum_{i=1}^{K} p_i = 1$$
We say that $X$ has a multinomial distribution with probabilities $p_1$, ..., $p_K$ and number of trials $n$, if its joint probability mass function is
$$p_X(x_1, \ldots, x_K) = \begin{cases} \dfrac{n!}{x_1! \, x_2! \, \cdots \, x_K!} \, p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K} & \text{if } (x_1, \ldots, x_K) \in R_X \\ 0 & \text{otherwise} \end{cases}$$
where
$$\frac{n!}{x_1! \, x_2! \, \cdots \, x_K!}$$
is the multinomial coefficient.
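As a quick sanity check on the definition, the joint probability mass function can be computed directly from the multinomial coefficient. The snippet below is a minimal Python sketch; the function name `multinomial_pmf` is our own (libraries such as SciPy provide an equivalent, `scipy.stats.multinomial.pmf`):

```python
from math import factorial, prod

def multinomial_pmf(x, p):
    """Joint pmf of a multinomial random vector, evaluated at the
    vector of counts x = (x_1, ..., x_K) with probabilities p."""
    n = sum(x)
    # multinomial coefficient n! / (x_1! x_2! ... x_K!)
    coef = factorial(n) // prod(factorial(k) for k in x)
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

p = (0.5, 0.25, 0.25)
print(multinomial_pmf((2, 1, 1), p))  # 12 * 0.5^2 * 0.25 * 0.25 = 0.1875

# The pmf sums to 1 over the support (all count vectors summing to n = 4):
total = sum(multinomial_pmf((a, b, 4 - a - b), p)
            for a in range(5) for b in range(5 - a))
print(round(total, 10))  # 1.0
```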
The connection between the multinomial and the Multinoulli distribution is illustrated by the following propositions.
Proposition
If a random vector $X$ has a multinomial distribution with probabilities $p_1$, ..., $p_K$ and number of trials $n = 1$, then it has a Multinoulli distribution with probabilities $p_1$, ..., $p_K$.

Proof
The support of $X$ is
$$R_X = \left\{ x \in \{0, 1\}^K : \sum_{i=1}^{K} x_i = 1 \right\}$$
and its joint probability mass function is
$$p_X(x_1, \ldots, x_K) = \frac{1!}{x_1! \, x_2! \, \cdots \, x_K!} \, p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K}$$
for $(x_1, \ldots, x_K) \in R_X$. But
$$\frac{1!}{x_1! \, x_2! \, \cdots \, x_K!} = 1$$
because, for each $i$, either $x_i = 0$ or $x_i = 1$, and $0! = 1! = 1$. As a consequence,
$$p_X(x_1, \ldots, x_K) = p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K}$$
which is the joint probability mass function of a Multinoulli distribution.
Proposition
A random vector $X$ having a multinomial distribution with parameters $n$ and $p_1$, ..., $p_K$ can be written as
$$X = \sum_{j=1}^{n} Y_j$$
where $Y_1$, ..., $Y_n$ are $n$ independent random vectors all having a Multinoulli distribution with parameters $p_1$, ..., $p_K$.

Proof
The sum $\sum_{j=1}^{n} Y_j$ is equal to the vector $x = [x_1 \ \ldots \ x_K]^{\top}$ when, for each $i$, exactly $x_i$ of the $n$ Multinoulli vectors $Y_1$, ..., $Y_n$ select the $i$-th outcome. Provided $x_i \geq 0$ for each $i$ and
$$\sum_{i=1}^{K} x_i = n$$
there are several different realizations of the vector $(Y_1, \ldots, Y_n)$ satisfying these conditions. Since $Y_1$, ..., $Y_n$ are independent Multinoulli variables, each of these realizations has probability
$$p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K}$$
(see also the proof of the previous proposition). Furthermore, the number of realizations satisfying the above conditions is equal to the number of partitions of $n$ objects into $K$ groups having numerosities $x_1$, ..., $x_K$ (see the lecture entitled Partitions), which in turn is equal to the multinomial coefficient
$$\frac{n!}{x_1! \, x_2! \, \cdots \, x_K!}$$
Therefore,
$$\Pr\left( \sum_{j=1}^{n} Y_j = x \right) = \frac{n!}{x_1! \, x_2! \, \cdots \, x_K!} \, p_1^{x_1} p_2^{x_2} \cdots p_K^{x_K}$$
which proves that $X$ and $\sum_{j=1}^{n} Y_j$ have the same distribution.
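This representation is easy to exercise by simulation. Below is a minimal sketch assuming NumPy is available; `multinoulli` is our own helper that draws one one-hot (Multinoulli) vector:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, np.array([0.5, 0.25, 0.25])

def multinoulli(rng, p):
    """One Multinoulli draw: a one-hot vector selecting one of K outcomes."""
    y = np.zeros(len(p), dtype=int)
    y[rng.choice(len(p), p=p)] = 1
    return y

# Summing n independent Multinoulli vectors yields one multinomial draw:
# entry i counts how many of the n trials produced the i-th outcome.
x = sum(multinoulli(rng, p) for _ in range(n))
print(x, x.sum())  # the counts always total n = 10
```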
The expected value of a multinomial random vector $X$ is
$$\mathrm{E}[X] = np$$
where the $K \times 1$ vector $p$ is defined as follows:
$$p = \begin{bmatrix} p_1 & p_2 & \ldots & p_K \end{bmatrix}^{\top}$$

Proof
Using the fact that $X$ can be written as a sum of $n$ Multinoulli variables with parameters $p_1$, ..., $p_K$, we obtain
$$\mathrm{E}[X] = \mathrm{E}\left[ \sum_{j=1}^{n} Y_j \right] = \sum_{j=1}^{n} \mathrm{E}[Y_j] = np$$
where $\mathrm{E}[Y_j] = p$ is the expected value of a Multinoulli random variable.
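The formula $\mathrm{E}[X] = np$ can be checked against a Monte Carlo sample; a minimal sketch, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, np.array([0.5, 0.25, 0.25])

draws = rng.multinomial(n, p, size=200_000)  # 200,000 multinomial draws
print(draws.mean(axis=0))  # sample mean, close to n*p = [5, 2.5, 2.5]
```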
The covariance matrix of a multinomial random vector $X$ is
$$\mathrm{Var}[X] = n\Sigma$$
where $\Sigma$ is a $K \times K$ matrix whose generic entry is
$$\Sigma_{ij} = \begin{cases} p_i(1 - p_i) & \text{if } i = j \\ -p_i p_j & \text{if } i \neq j \end{cases}$$

Proof
Since $X$ can be represented as a sum of $n$ independent Multinoulli random variables with parameters $p_1$, ..., $p_K$, we obtain
$$\mathrm{Var}[X] = \mathrm{Var}\left[ \sum_{j=1}^{n} Y_j \right] = \sum_{j=1}^{n} \mathrm{Var}[Y_j] = n\Sigma$$
where $\mathrm{Var}[Y_j] = \Sigma$ is the covariance matrix of a Multinoulli random variable.
The joint moment generating function of a multinomial random vector $X$ is defined for any $t \in \mathbb{R}^K$:
$$M_X(t) = \left( \sum_{i=1}^{K} p_i \exp(t_i) \right)^n$$

Proof
Since $X$ can be written as a sum of $n$ independent Multinoulli random vectors with parameters $p_1$, ..., $p_K$, the joint moment generating function of $X$ is derived from that of the summands:
$$M_X(t) = \mathrm{E}\left[ \exp(t^{\top} X) \right] = \prod_{j=1}^{n} \mathrm{E}\left[ \exp(t^{\top} Y_j) \right] = \prod_{j=1}^{n} \left( \sum_{i=1}^{K} p_i \exp(t_i) \right) = \left( \sum_{i=1}^{K} p_i \exp(t_i) \right)^n$$
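The closed form $\left( \sum_i p_i e^{t_i} \right)^n$ can be checked against a Monte Carlo average of $\exp(t^{\top} X)$; a minimal sketch assuming NumPy, with an arbitrarily chosen evaluation point $t$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, np.array([0.5, 0.25, 0.25])
t = np.array([0.1, -0.2, 0.05])  # arbitrary evaluation point

analytic = (p * np.exp(t)).sum() ** n     # (sum_i p_i e^{t_i})^n
draws = rng.multinomial(n, p, size=200_000)
empirical = np.exp(draws @ t).mean()      # Monte Carlo estimate of E[exp(t'X)]
print(analytic, empirical)                # the two agree closely
```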
The joint characteristic function of $X$ is
$$\varphi_X(t) = \left( \sum_{i=1}^{K} p_i \exp(\mathrm{i} \, t_i) \right)^n$$

Proof
The derivation is similar to the derivation of the joint moment generating function (see above):
$$\varphi_X(t) = \mathrm{E}\left[ \exp(\mathrm{i} \, t^{\top} X) \right] = \prod_{j=1}^{n} \mathrm{E}\left[ \exp(\mathrm{i} \, t^{\top} Y_j) \right] = \left( \sum_{i=1}^{K} p_i \exp(\mathrm{i} \, t_i) \right)^n$$
Below you can find some exercises with explained solutions.
A shop selling two items, labeled A and B, needs to construct a probabilistic
model of the sales that will be generated by its next 10 customers. Each time
a customer arrives, only three outcomes are possible: 1) nothing is sold; 2)
one unit of item A is sold; 3) one unit of item B is sold. It has been
estimated that the probabilities of these three outcomes are 0.50, 0.25 and
0.25 respectively. Furthermore, the shopping behavior of a customer is
independent of the shopping behavior of all other customers. Denote by
$$X = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}^{\top}$$
a vector whose entries $X_1$, $X_2$ and $X_3$ are equal to the number of times each of the three outcomes occurs. Derive the expected value and the covariance matrix of $X$.
Solution

The vector $X$ has a multinomial distribution with parameters $n = 10$ and
$$p = \begin{bmatrix} 0.50 & 0.25 & 0.25 \end{bmatrix}^{\top}$$
Therefore, its expected value is
$$\mathrm{E}[X] = np = \begin{bmatrix} 5 & 2.5 & 2.5 \end{bmatrix}^{\top}$$
and its covariance matrix is
$$\mathrm{Var}[X] = 10 \begin{bmatrix} 0.25 & -0.125 & -0.125 \\ -0.125 & 0.1875 & -0.0625 \\ -0.125 & -0.0625 & 0.1875 \end{bmatrix} = \begin{bmatrix} 2.5 & -1.25 & -1.25 \\ -1.25 & 1.875 & -0.625 \\ -1.25 & -0.625 & 1.875 \end{bmatrix}$$
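These numbers can be reproduced with a few lines of NumPy, using the formulas $\mathrm{E}[X] = np$ and $\mathrm{Var}[X] = n(\mathrm{diag}(p) - pp^{\top})$ (a sketch, not part of the original solution):

```python
import numpy as np

n = 10
p = np.array([0.50, 0.25, 0.25])

mean = n * p                              # E[X] = [5, 2.5, 2.5]
cov = n * (np.diag(p) - np.outer(p, p))   # diagonal 2.5, 1.875, 1.875
print(mean)
print(cov)
```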
Given the assumptions made in the previous exercise, suppose that item A costs $1,000 and item B costs $2,000. Derive the expected value and the variance of the total revenue generated by the 10 customers.
Solution

The total revenue $R$ can be written as a linear transformation of the vector $X$:
$$R = AX$$
where
$$A = \begin{bmatrix} 0 & 1000 & 2000 \end{bmatrix}$$
By the linearity of the expected value operator, we obtain
$$\mathrm{E}[R] = A \, \mathrm{E}[X] = 0 \cdot 5 + 1000 \cdot 2.5 + 2000 \cdot 2.5 = 7500$$
By using the formula for the covariance matrix of a linear transformation, we obtain
$$\mathrm{Var}[R] = A \, \mathrm{Var}[X] \, A^{\top} = 6{,}875{,}000$$
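The revenue computation reduces to a short linear-algebra check; a NumPy sketch, rebuilding the mean and covariance from the previous exercise:

```python
import numpy as np

n = 10
p = np.array([0.50, 0.25, 0.25])
A = np.array([0.0, 1000.0, 2000.0])  # revenue per unit: nothing, item A, item B

mean = n * p
cov = n * (np.diag(p) - np.outer(p, p))

exp_revenue = A @ mean     # E[R] = A E[X] = 7500
var_revenue = A @ cov @ A  # Var[R] = A Var[X] A' = 6,875,000
print(exp_revenue, var_revenue)
```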
Please cite as:
Taboga, Marco (2021). "Multinomial distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/multinomial-distribution.