The Multinoulli distribution (sometimes also called categorical distribution) is a generalization of the Bernoulli distribution. If you perform an experiment that can have only two outcomes (either success or failure), then a random variable that takes value 1 in case of success and value 0 in case of failure is a Bernoulli random variable. If you perform an experiment that can have $K$ outcomes and you denote by $X_i$ a random variable that takes value 1 if you obtain the $i$-th outcome and 0 otherwise, then the random vector $X$ defined as
$$X = \begin{bmatrix} X_1 & X_2 & \cdots & X_K \end{bmatrix}^{\top}$$
is a Multinoulli random vector. In other words, when the $i$-th outcome is obtained, the $i$-th entry of the Multinoulli random vector $X$ takes value $1$, while all other entries take value $0$.
In what follows the probabilities of the $K$ possible outcomes will be denoted by $p_1, \ldots, p_K$.
The distribution is characterized as follows.
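Definition. Let $X$ be a $K \times 1$ discrete random vector. Let the support of $X$ be the set of $K \times 1$ vectors having one entry equal to $1$ and all other entries equal to $0$:
$$R_X = \left\{ x \in \{0,1\}^K : \sum_{j=1}^{K} x_j = 1 \right\}$$
Let $p_1$, ..., $p_K$ be $K$ strictly positive numbers such that
$$\sum_{j=1}^{K} p_j = 1$$
We say that $X$ has a Multinoulli distribution with probabilities $p_1$, ..., $p_K$ if its joint probability mass function is
$$p_X(x_1, \ldots, x_K) = \begin{cases} \prod_{j=1}^{K} p_j^{x_j} & \text{if } (x_1, \ldots, x_K) \in R_X \\ 0 & \text{otherwise} \end{cases}$$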
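If you are puzzled by the above definition of the joint pmf, note that when $(x_1, \ldots, x_K) \in R_X$ and $x_i = 1$ because the $i$-th outcome has been obtained, then all other entries are equal to $0$ and
$$\prod_{j=1}^{K} p_j^{x_j} = p_1^{0} \cdots p_{i-1}^{0} \, p_i^{1} \, p_{i+1}^{0} \cdots p_K^{0} = p_i$$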
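As an illustration, here is a minimal Python sketch of the joint pmf, assuming it equals the product of the $p_j^{x_j}$ on the support and zero elsewhere. The function name `multinoulli_pmf` and the probabilities in `p` are illustrative choices, not part of the lecture.

```python
from math import prod

def multinoulli_pmf(x, p):
    """Joint pmf of a Multinoulli vector with probabilities p, evaluated at x."""
    # x belongs to the support only if it is a 0/1 vector of the right
    # length with exactly one entry equal to 1.
    in_support = (
        len(x) == len(p)
        and all(v in (0, 1) for v in x)
        and sum(x) == 1
    )
    if not in_support:
        return 0.0
    # Product of p_j ** x_j: every factor equals 1 except the one with x_j = 1.
    return prod(pj ** xj for pj, xj in zip(p, x))

p = [0.2, 0.5, 0.3]  # assumed example probabilities (strictly positive, sum to 1)
print(multinoulli_pmf([0, 1, 0], p))  # 0.5, the probability of the second outcome
print(multinoulli_pmf([1, 1, 0], p))  # 0.0, the vector is outside the support
```

Evaluating the function at each of the $K$ support points returns $p_1, \ldots, p_K$ in turn, so the values sum to one.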
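The expected value of $X$ is
$$\mathrm{E}[X] = p$$
where the $K \times 1$ vector $p$ is defined as follows:
$$p = \begin{bmatrix} p_1 & p_2 & \cdots & p_K \end{bmatrix}^{\top}$$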
The $i$-th entry of $X$, denoted by $X_i$, is an indicator function of the event "the $i$-th outcome has happened". Therefore, its expected value is equal to the probability of the event it indicates:
$$\mathrm{E}[X_i] = p_i$$
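The indicator argument can be checked numerically by summing over the support. The sketch below assumes that the support point with a 1 in position $i$ has probability $p_i$; the helper names `support` and `expected_value` are hypothetical.

```python
def support(K):
    """All K-dimensional 0/1 vectors with exactly one entry equal to 1."""
    return [[1 if j == i else 0 for j in range(K)] for i in range(K)]

def expected_value(p):
    """Expected value of a Multinoulli vector, computed as a sum over its support."""
    K = len(p)
    mean = [0.0] * K
    # The i-th support point (a 1 in position i) occurs with probability p[i].
    for i, x in enumerate(support(K)):
        for j in range(K):
            mean[j] += p[i] * x[j]
    return mean

p = [0.2, 0.5, 0.3]  # assumed example probabilities
print(expected_value(p))  # [0.2, 0.5, 0.3]
```

Each entry of the result picks up exactly one non-zero term, which is why the mean vector reproduces the vector of probabilities.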
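The covariance matrix of $X$ is
$$\mathrm{Var}[X] = \Sigma$$
where $\Sigma$ is a $K \times K$ matrix whose generic entry is
$$\Sigma_{ij} = \begin{cases} p_i (1 - p_i) & \text{if } j = i \\ -p_i p_j & \text{if } j \neq i \end{cases}$$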
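We need to use the formula (see the lecture entitled Covariance matrix):
$$\Sigma_{ij} = \mathrm{Cov}[X_i, X_j] = \mathrm{E}[X_i X_j] - \mathrm{E}[X_i]\,\mathrm{E}[X_j]$$
If $j = i$, then
$$\Sigma_{ii} = \mathrm{E}[X_i^2] - \mathrm{E}[X_i]^2 = \mathrm{E}[X_i] - \mathrm{E}[X_i]^2 = p_i - p_i^2 = p_i (1 - p_i)$$
where we have used the fact that $X_i^2 = X_i$ because $X_i$ can take only values $0$ and $1$. If $j \neq i$, then
$$\Sigma_{ij} = \mathrm{E}[X_i X_j] - \mathrm{E}[X_i]\,\mathrm{E}[X_j] = 0 - p_i p_j = -p_i p_j$$
where we have used the fact that $X_i X_j = 0$, because $X_i$ and $X_j$ cannot be both equal to $1$ at the same time.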
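The two cases of the covariance formula can be verified with a short Python sketch. It assumes the closed form $p_i(1-p_i)$ on the diagonal and $-p_i p_j$ off the diagonal, and compares it with $\mathrm{E}[X_i X_j] - \mathrm{E}[X_i]\mathrm{E}[X_j]$ computed by summing over the $K$ outcomes. The function names are illustrative.

```python
def covariance_formula(p):
    """Closed-form covariance matrix: p_i(1 - p_i) on the diagonal, -p_i p_j elsewhere."""
    K = len(p)
    return [
        [p[i] * (1 - p[i]) if i == j else -p[i] * p[j] for j in range(K)]
        for i in range(K)
    ]

def covariance_by_enumeration(p):
    """Covariance matrix computed as E[X_i X_j] - E[X_i] E[X_j] over the support."""
    K = len(p)
    cov = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(K):
            # Under outcome k, X_i = 1 only if i == k, so X_i X_j = 1 only if i == j == k.
            e_xixj = sum(
                p[k] * (1 if k == i else 0) * (1 if k == j else 0) for k in range(K)
            )
            cov[i][j] = e_xixj - p[i] * p[j]
    return cov

p = [0.2, 0.5, 0.3]  # assumed example probabilities
sigma = covariance_formula(p)
check = covariance_by_enumeration(p)
max_gap = max(abs(sigma[i][j] - check[i][j]) for i in range(3) for j in range(3))
print(max_gap < 1e-12)  # True

# Each row sums to zero (up to rounding) because the entries of X always add up to 1.
print([abs(sum(row)) < 1e-12 for row in sigma])  # [True, True, True]
```

The zero row sums are a useful sanity check: the covariance matrix is singular, since one entry of the vector is always determined by the others.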
The joint moment generating function of $X$ is defined for any $t \in \mathbb{R}^K$:
$$M_X(t) = \sum_{j=1}^{K} p_j \exp(t_j)$$
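If the $j$-th outcome is obtained, then $X_i = 0$ for $i \neq j$ and $X_i = 1$ for $i = j$. As a consequence,
$$\exp(t^{\top} X) = \exp(t_1 X_1 + t_2 X_2 + \cdots + t_K X_K) = \exp(t_j)$$
and the joint moment generating function is
$$M_X(t) = \mathrm{E}[\exp(t^{\top} X)] = \sum_{j=1}^{K} p_j \exp(t_j)$$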
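The derivation can be mirrored in code. The sketch below assumes the closed form $M_X(t) = \sum_j p_j \exp(t_j)$ and compares it with the expectation of $\exp(t^{\top} X)$ taken over the $K$ support points. As an extra check, which relies on the standard property that the first partial derivatives of a joint mgf at zero give the means, it approximates the gradient at $t = 0$ by central differences. All function names and the step size `h` are illustrative assumptions.

```python
from math import exp

def mgf(t, p):
    """Closed-form joint mgf: sum of p_j * exp(t_j)."""
    return sum(pj * exp(tj) for pj, tj in zip(p, t))

def mgf_by_enumeration(t, p):
    """E[exp(t'X)] computed by summing over the K one-hot support points."""
    K = len(p)
    total = 0.0
    for j in range(K):
        x = [1 if i == j else 0 for i in range(K)]  # outcome j obtained
        total += p[j] * exp(sum(ti * xi for ti, xi in zip(t, x)))
    return total

def gradient_at_zero(p, h=1e-5):
    """Central-difference approximation of the mgf gradient at t = 0."""
    K = len(p)
    grad = []
    for i in range(K):
        plus = [h if k == i else 0.0 for k in range(K)]
        minus = [-h if k == i else 0.0 for k in range(K)]
        grad.append((mgf(plus, p) - mgf(minus, p)) / (2 * h))
    return grad

p = [0.2, 0.5, 0.3]   # assumed example probabilities
t = [0.7, -1.3, 2.0]  # arbitrary evaluation point
print(abs(mgf(t, p) - mgf_by_enumeration(t, p)) < 1e-12)             # True
print(all(abs(g - pi) < 1e-8 for g, pi in zip(gradient_at_zero(p), p)))  # True
```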
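The joint characteristic function of $X$ is
$$\varphi_X(t) = \sum_{j=1}^{K} p_j \exp(\mathrm{i}\, t_j)$$
where $\mathrm{i}$ is the imaginary unit.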
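If the $j$-th outcome is obtained, then $X_k = 0$ for $k \neq j$ and $X_k = 1$ for $k = j$ (here $k$ indexes the entries of $X$, to avoid a clash with the imaginary unit $\mathrm{i}$). As a consequence,
$$\exp(\mathrm{i}\, t^{\top} X) = \exp(\mathrm{i} t_1 X_1 + \mathrm{i} t_2 X_2 + \cdots + \mathrm{i} t_K X_K) = \exp(\mathrm{i}\, t_j)$$
and the joint characteristic function is
$$\varphi_X(t) = \mathrm{E}[\exp(\mathrm{i}\, t^{\top} X)] = \sum_{j=1}^{K} p_j \exp(\mathrm{i}\, t_j)$$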
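A complex-valued version of the same check can be written with Python's `cmath` module. The sketch assumes the closed form $\varphi_X(t) = \sum_j p_j \exp(\mathrm{i}\, t_j)$; the function names are illustrative.

```python
import cmath

def cf(t, p):
    """Closed-form joint characteristic function: sum of p_j * exp(i * t_j)."""
    return sum(pj * cmath.exp(1j * tj) for pj, tj in zip(p, t))

def cf_by_enumeration(t, p):
    """E[exp(i t'X)] computed by summing over the K one-hot support points."""
    K = len(p)
    total = 0j
    for j in range(K):
        x = [1 if k == j else 0 for k in range(K)]  # outcome j obtained
        total += p[j] * cmath.exp(1j * sum(tk * xk for tk, xk in zip(t, x)))
    return total

p = [0.2, 0.5, 0.3]   # assumed example probabilities
t = [0.7, -1.3, 2.0]  # arbitrary evaluation point
print(abs(cf(t, p) - cf_by_enumeration(t, p)) < 1e-12)  # True
print(abs(cf([0.0, 0.0, 0.0], p) - 1.0) < 1e-12)        # True: value 1 at t = 0
print(abs(cf(t, p)) <= 1.0)                             # True: modulus never exceeds 1
```

Unlike the moment generating function, the characteristic function is a weighted average of points on the unit circle, so its modulus is bounded by one for every $t$.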
The following sections contain more details about the Multinoulli distribution.
A sum of independent Multinoulli random vectors that share the same probabilities $p_1, \ldots, p_K$ is a multinomial random vector. This is discussed and proved in the lecture entitled Multinomial distribution.
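The relation to the multinomial distribution can be illustrated exactly for a small case, without simulation. The sketch below assumes $n$ independent Multinoulli vectors with identical probabilities, builds the distribution of their sum by repeated convolution, and compares it with the usual multinomial pmf $\frac{n!}{x_1! \cdots x_K!} p_1^{x_1} \cdots p_K^{x_K}$. The function names, `n = 4`, and the probabilities are illustrative assumptions.

```python
from math import factorial, prod

def add_multinoulli(dist, p):
    """pmf of S + X, where S has pmf `dist` and X is an independent Multinoulli(p)."""
    out = {}
    for counts, prob in dist.items():
        for j, pj in enumerate(p):
            # Outcome j adds 1 to the j-th count with probability p_j.
            new = list(counts)
            new[j] += 1
            key = tuple(new)
            out[key] = out.get(key, 0.0) + prob * pj
    return out

def multinomial_pmf(counts, p):
    """Multinomial pmf with n = sum(counts) trials and probabilities p."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(pj ** c for pj, c in zip(p, counts))

p = [0.2, 0.5, 0.3]  # assumed example probabilities
n = 4                # assumed number of independent Multinoulli vectors

dist = {(0,) * len(p): 1.0}  # the empty sum is the zero vector with probability 1
for _ in range(n):
    dist = add_multinoulli(dist, p)

max_gap = max(abs(prob - multinomial_pmf(c, p)) for c, prob in dist.items())
print(len(dist), max_gap < 1e-12)  # 15 True
```

The sum has 15 possible values here, one for each way of splitting 4 trials among 3 outcomes, and the convolved probabilities agree with the multinomial formula up to rounding error.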