Multinoulli distribution

The Multinoulli distribution (sometimes also called categorical distribution) is a multivariate discrete distribution that generalizes the Bernoulli distribution.

Table of contents

How the distribution is used
Definition
Expected value
Covariance matrix
Joint moment generating function
Joint characteristic function
Relation between the Multinoulli and the multinomial distribution

How the distribution is used

If you perform an experiment that can have only two outcomes (either success or failure), then a random variable that takes value 1 in case of success and value 0 in case of failure is a Bernoulli random variable.

If you perform an experiment that can have $K$ outcomes and you denote by $X_{i}$ a random variable that takes value 1 if you obtain the $i$-th outcome and 0 otherwise, then the random vector $X$ defined as $X=[X_{1},X_{2},\ldots,X_{K}]^{\top}$ is a Multinoulli random vector.

In other words, when the $i$-th outcome is obtained, the $i$-th entry of the Multinoulli random vector $X$ takes value $1$, while all the other entries are equal to $0$.

In what follows the probabilities of the $K$ possible outcomes will be denoted by $p_{1}$, ..., $p_{K}$.
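The construction above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_multinoulli`, the three probabilities and the seed are assumptions made for the example, not part of the lecture.

```python
import random

def sample_multinoulli(probs, rng):
    """Draw one Multinoulli vector: a list with a single 1 and all other entries 0."""
    # Pick the index of the outcome that occurs; index i is chosen with probability probs[i].
    outcome = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Encode the outcome as a vector of indicators: 1 in position `outcome`, 0 elsewhere.
    return [1 if i == outcome else 0 for i in range(len(probs))]

rng = random.Random(0)      # fixed seed, used only to make the run reproducible
probs = [0.2, 0.5, 0.3]     # illustrative probabilities of the K = 3 outcomes
x = sample_multinoulli(probs, rng)
print(x)                    # a list of length 3 containing exactly one 1
```

Whatever outcome is drawn, the entries of the returned vector always sum to 1, which is the defining feature of a Multinoulli realization.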

Definition

The distribution is characterized as follows.

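Definition Let $X$ be a $K\times 1$ discrete random vector. Let the support of $X$ be the set of $K\times 1$ vectors having one entry equal to $1$ and all other entries equal to $0$: $R_{X}=\{x\in \{0,1\}^{K}:\sum_{j=1}^{K}x_{j}=1\}$. Let $p_{1}$, ..., $p_{K}$ be strictly positive numbers such that $\sum_{j=1}^{K}p_{j}=1$. We say that $X$ has a Multinoulli distribution with probabilities $p_{1}$, ..., $p_{K}$ if its joint probability mass function is $p_{X}(x_{1},\ldots ,x_{K})=\prod_{j=1}^{K}p_{j}^{x_{j}}$ if $(x_{1},\ldots ,x_{K})\in R_{X}$ and $p_{X}(x_{1},\ldots ,x_{K})=0$ otherwise.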

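If you are puzzled by the above definition of the joint pmf, note that when $(x_{1},\ldots ,x_{K})$ belongs to the support and $x_{i}=1$ because the $i$-th outcome has been obtained, then all other entries are equal to $0$ and $p_{1}^{x_{1}}\cdots p_{K}^{x_{K}}=p_{1}^{0}\cdots p_{i}^{1}\cdots p_{K}^{0}=p_{i}$.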
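To see the pmf at work, here is a minimal sketch that evaluates it as the product of the probabilities raised to the entries of the vector, returning zero outside the support. The function name `multinoulli_pmf` and the numerical probabilities are hypothetical choices made for the example.

```python
import math

def multinoulli_pmf(x, probs):
    """Joint pmf: product of probs[j] ** x[j] on the support, 0 elsewhere."""
    # Support check: same length as probs, entries in {0, 1}, exactly one entry equal to 1.
    in_support = (
        len(x) == len(probs)
        and all(v in (0, 1) for v in x)
        and sum(x) == 1
    )
    if not in_support:
        return 0.0
    # Every factor with x[j] == 0 equals 1, so only the probability of the realized outcome survives.
    return math.prod(p ** v for p, v in zip(probs, x))

probs = [0.2, 0.5, 0.3]                    # illustrative probabilities
print(multinoulli_pmf([0, 1, 0], probs))   # probability of the second outcome
print(multinoulli_pmf([1, 1, 0], probs))   # not in the support
```

The first call returns the probability of the second outcome, while the second call returns zero because a vector with two entries equal to 1 does not belong to the support.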

Expected value

The expected value of $X$ is $\mathrm{E}[X]=p$, where the $K\times 1$ vector $p$ is defined as follows: $p=[p_{1},p_{2},\ldots ,p_{K}]^{\top}$.

Proof

The $i$-th entry of $X$, denoted by $X_{i}$, is an indicator function of the event "the $i$-th outcome has happened". Therefore, its expected value is equal to the probability of the event it indicates: $\mathrm{E}[X_{i}]=p_{i}$.
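The result can also be checked by brute force, since the support contains only $K$ points. The sketch below sums each support vector weighted by its probability; the probabilities and the variable names are assumptions made for illustration.

```python
import math

probs = [0.2, 0.5, 0.3]     # illustrative probabilities p_1, p_2, p_3
K = len(probs)

# The j-th support point is the vector with 1 in position j and 0 elsewhere;
# it is observed with probability probs[j].
units = [[1 if i == j else 0 for i in range(K)] for j in range(K)]

# Entry i of the expected value: sum over the support of (probability * i-th entry).
mean = [sum(p_j * e_j[i] for p_j, e_j in zip(probs, units)) for i in range(K)]

print(mean)
print(all(math.isclose(m, p) for m, p in zip(mean, probs)))
```

Each entry of `mean` coincides with the corresponding probability, as stated by the proposition.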

Covariance matrix

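The covariance matrix of $X$ is $\mathrm{Var}[X]=\Sigma$, where $\Sigma$ is a $K\times K$ matrix whose generic entry is $\Sigma_{ij}=p_{i}(1-p_{i})$ if $j=i$ and $\Sigma_{ij}=-p_{i}p_{j}$ if $j\neq i$.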

Proof

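We need to use the formula (see the lecture entitled Covariance matrix): $\Sigma_{ij}=\mathrm{E}[X_{i}X_{j}]-\mathrm{E}[X_{i}]\mathrm{E}[X_{j}]$. If $j=i$, then $\Sigma_{ii}=\mathrm{E}[X_{i}^{2}]-\mathrm{E}[X_{i}]^{2}=\mathrm{E}[X_{i}]-\mathrm{E}[X_{i}]^{2}=p_{i}-p_{i}^{2}=p_{i}(1-p_{i})$, where we have used the fact that $X_{i}^{2}=X_{i}$ because $X_{i}$ can take only values $0$ and $1$. If $j\neq i$, then $\Sigma_{ij}=\mathrm{E}[X_{i}X_{j}]-\mathrm{E}[X_{i}]\mathrm{E}[X_{j}]=0-p_{i}p_{j}=-p_{i}p_{j}$, where we have used the fact that $X_{i}X_{j}=0$, because $X_{i}$ and $X_{j}$ cannot be both equal to $1$ at the same time.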
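The two cases of the proof can be verified numerically. The following sketch computes each covariance exactly by summing over the support and compares it with the closed-form entries; the helper `expect` and the probabilities are hypothetical, introduced only for this example.

```python
import math

probs = [0.2, 0.5, 0.3]     # illustrative probabilities
K = len(probs)

# Support points: the j-th unit vector, observed with probability probs[j].
units = [[1 if i == j else 0 for i in range(K)] for j in range(K)]

def expect(f):
    """Exact expected value of f(X), obtained by summing over the K support points."""
    return sum(p * f(e) for p, e in zip(probs, units))

# Covariance from the formula E[X_i X_j] - E[X_i] E[X_j].
cov = [
    [expect(lambda x: x[i] * x[j]) - expect(lambda x: x[i]) * expect(lambda x: x[j])
     for j in range(K)]
    for i in range(K)
]

# Closed form: p_i (1 - p_i) on the diagonal, -p_i p_j off the diagonal.
closed_form = [
    [p_i * (1 - p_i) if i == j else -p_i * p_j for j, p_j in enumerate(probs)]
    for i, p_i in enumerate(probs)
]

for row in cov:
    print(row)
```

Up to floating-point rounding, the two matrices agree. Each row of the covariance matrix also sums to zero, which reflects the fact that the entries of a Multinoulli vector always add up to 1.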

Joint moment generating function

The joint moment generating function of $X$ is defined for any $t\in \mathbb{R}^{K}$: $M_{X}(t)=\sum_{j=1}^{K}p_{j}\exp (t_{j})$.

Proof

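If the $j$-th outcome is obtained, then $X_{i}=0$ for $i\neq j$ and $X_{i}=1$ for $i=j$. As a consequence, $\exp (t^{\top}X)=\exp (t_{1}X_{1}+\ldots +t_{K}X_{K})=\exp (t_{j})$, and the joint moment generating function is $M_{X}(t)=\mathrm{E}[\exp (t^{\top}X)]=\sum_{j=1}^{K}p_{j}\exp (t_{j})$.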
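As a quick check of the argument, the sketch below evaluates the expected value of $\exp (t^{\top}X)$ directly over the support and compares it with the probability-weighted sum of $\exp (t_{j})$. The vector `t`, the probabilities and the helper `dot` are assumptions chosen for illustration.

```python
import math

probs = [0.2, 0.5, 0.3]     # illustrative probabilities
t = [0.1, -0.4, 1.2]        # arbitrary point at which the mgf is evaluated
K = len(probs)

# Support points: the j-th unit vector, observed with probability probs[j].
units = [[1 if i == j else 0 for i in range(K)] for j in range(K)]

def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(u * v for u, v in zip(a, b))

# Definition: E[exp(t'X)], summing over the support.
mgf_by_definition = sum(p * math.exp(dot(t, e)) for p, e in zip(probs, units))

# Closed form: sum over j of p_j * exp(t_j).
mgf_closed_form = sum(p * math.exp(t_j) for p, t_j in zip(probs, t))

print(mgf_by_definition, mgf_closed_form)
```

The two numbers coincide because, on the $j$-th support point, the inner product of `t` with the vector reduces to `t[j]`.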

Joint characteristic function

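The joint characteristic function of $X$ is $\varphi_{X}(t)=\sum_{j=1}^{K}p_{j}\exp (\mathrm{i}t_{j})$, where $\mathrm{i}$ denotes the imaginary unit.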

Proof

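If the $j$-th outcome is obtained, then $X_{i}=0$ for $i\neq j$ and $X_{i}=1$ for $i=j$. As a consequence, denoting the imaginary unit by $\mathrm{i}$, $\exp (\mathrm{i}t^{\top}X)=\exp (\mathrm{i}t_{1}X_{1}+\ldots +\mathrm{i}t_{K}X_{K})=\exp (\mathrm{i}t_{j})$, and the joint characteristic function is $\varphi_{X}(t)=\mathrm{E}[\exp (\mathrm{i}t^{\top}X)]=\sum_{j=1}^{K}p_{j}\exp (\mathrm{i}t_{j})$.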
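The same kind of numerical check works for the characteristic function, using complex exponentials from Python's `cmath` module. As before, the probabilities, the vector `t` and the variable names are illustrative assumptions.

```python
import cmath

probs = [0.2, 0.5, 0.3]     # illustrative probabilities
t = [0.1, -0.4, 1.2]        # arbitrary point at which the function is evaluated
K = len(probs)

# Support points: the j-th unit vector, observed with probability probs[j].
units = [[1 if i == j else 0 for i in range(K)] for j in range(K)]

# Definition: E[exp(i t'X)], summing over the support (1j is Python's imaginary unit).
cf_by_definition = sum(
    p * cmath.exp(1j * sum(u * v for u, v in zip(t, e)))
    for p, e in zip(probs, units)
)

# Closed form: sum over j of p_j * exp(i t_j).
cf_closed_form = sum(p * cmath.exp(1j * t_j) for p, t_j in zip(probs, t))

print(cf_by_definition, cf_closed_form)
print(abs(cf_closed_form) <= 1.0)   # a characteristic function never exceeds 1 in modulus
```

The two complex numbers agree, and their modulus does not exceed 1, as is the case for any characteristic function.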

Relation between the Multinoulli and the multinomial distribution

A sum of independent Multinoulli random vectors that have the same probabilities $p_{1}$, ..., $p_{K}$ is a multinomial random vector. This is discussed and proved in the lecture entitled Multinomial distribution.
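For small cases, the relation can be verified by exhaustive enumeration. The sketch below lists every sequence of outcomes of `n` independent experiments, accumulates the probability of each resulting vector of counts, and compares it with the standard multinomial pmf. The multinomial formula is not stated in this lecture and is taken as given here; the values of `n` and of the probabilities, as well as the function name `multinomial_pmf`, are assumptions made for the example.

```python
import itertools
import math
from collections import defaultdict

probs = [0.2, 0.5, 0.3]     # illustrative probabilities, shared by all the summands
K = len(probs)
n = 3                       # number of independent Multinoulli vectors being summed

# Distribution of the sum, built by brute force: each sequence of n outcomes
# has probability equal to the product of the probabilities of its outcomes,
# and the sum of the n indicator vectors is the vector of outcome counts.
dist = defaultdict(float)
for outcomes in itertools.product(range(K), repeat=n):
    counts = tuple(outcomes.count(j) for j in range(K))
    dist[counts] += math.prod(probs[j] for j in outcomes)

def multinomial_pmf(counts, probs):
    """Standard multinomial pmf: n! / (x_1! ... x_K!) * p_1^x_1 ... p_K^x_K."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    return coef * math.prod(p ** c for p, c in zip(probs, counts))

agree = all(math.isclose(prob, multinomial_pmf(c, probs)) for c, prob in dist.items())
print(len(dist), agree)
```

Every vector of counts that can be obtained as a sum receives the same probability under the two computations, which is consistent with the statement above.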

How to cite

Please cite as:

Taboga, Marco (2021). "Multinoulli distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/multinoulli-distribution.