Let $\Omega$ be a sample space and let $P(E)$ denote the probability assigned to the events $E \subseteq \Omega$. Suppose that, after assigning probabilities $P(E)$ to the events in $\Omega$, we receive new information about the things that will happen (the possible outcomes). In particular, suppose that we are told that the realized outcome will belong to a set $I \subseteq \Omega$. How should we revise the probabilities assigned to the events in $\Omega$, to properly take the new information into account?
Denote by $P(E \mid I)$ the revised probability assigned to an event $E \subseteq \Omega$ after learning that the realized outcome will be an element of $I$. $P(E \mid I)$ is called the conditional probability of $E$ given $I$.
Despite being an intuitive concept, conditional probability is quite difficult to define in a rigorous way. We take a gradual approach in this lecture. We first discuss conditional probability for the very special case in which all the sample points are equally likely. We then give a more general definition. Finally, we refer the reader to other lectures where conditional probability is defined in even more abstract ways.
Suppose a sample space $\Omega$ has a finite number $n$ of sample points $\omega_1, \ldots, \omega_n$, i.e.:
$$\Omega = \{\omega_1, \ldots, \omega_n\}$$
Suppose also that each sample point is assigned the same probability:
$$P(\{\omega_1\}) = \ldots = P(\{\omega_n\}) = \frac{1}{n}$$
In such a simple space, the probability of a generic event $E \subseteq \Omega$ is obtained as:
$$P(E) = \frac{\operatorname{card}(E)}{\operatorname{card}(\Omega)}$$
where $\operatorname{card}$ denotes the cardinality of a set, i.e. the number of its elements. In other words, the probability of an event $E$ is obtained in two steps:
counting the number of 'cases that are favorable to the event $E$', i.e. the number of elements belonging to $E$;
dividing the number thus obtained by the number of 'all possible cases', i.e. the number of elements belonging to $\Omega$.
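For example, if $E = \{\omega_1, \omega_2\}$, then:
$$P(E) = \frac{\operatorname{card}(\{\omega_1, \omega_2\})}{\operatorname{card}(\Omega)} = \frac{2}{n}$$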
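When we learn that the realized outcome will belong to a set $I \subseteq \Omega$, we still apply the rule:
$$P(E \mid I) = \frac{\text{number of favorable cases}}{\text{number of all possible cases}}$$
However, the number of all possible cases is now equal to the number of elements of $I$, because only the outcomes belonging to $I$ are still possible. Furthermore, the number of favorable cases is now equal to the number of elements of $E \cap I$, because the outcomes in $E \cap I^c$ are no longer possible. As a consequence:
$$P(E \mid I) = \frac{\operatorname{card}(E \cap I)}{\operatorname{card}(I)}$$
Dividing numerator and denominator by $\operatorname{card}(\Omega)$ one obtains:
$$P(E \mid I) = \frac{\operatorname{card}(E \cap I) / \operatorname{card}(\Omega)}{\operatorname{card}(I) / \operatorname{card}(\Omega)} = \frac{P(E \cap I)}{P(I)}$$
Therefore, when all sample points are equally likely, conditional probabilities are computed as:
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)}$$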
Example Suppose that we toss a die. Six numbers (from $1$ to $6$) can appear face up, but we do not yet know which one of them will appear. The sample space is:
$$\Omega = \{1, 2, 3, 4, 5, 6\}$$
Each of the six numbers is a sample point and is assigned probability $1/6$. Define the event $E$ as follows:
$$E = \{1, 3, 5\}$$
where the event $E$ could be described as 'an odd number appears face up'. Now define the event $I$ as follows:
$$I = \{4, 5, 6\}$$
where the event $I$ could be described as 'a number greater than three appears face up'. The probability of $I$ is:
$$P(I) = P(\{4\}) + P(\{5\}) + P(\{6\}) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$
Suppose we are told that the realized outcome will belong to $I$. How do we have to revise our assessment of the probability of the event $E$, according to the rules of conditional probability? First of all, we need to compute the probability of the event $E \cap I$:
$$P(E \cap I) = P(\{5\}) = \frac{1}{6}$$
Then, the conditional probability of $E$ given $I$ is:
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)} = \frac{1/6}{1/2} = \frac{1}{3}$$
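The die example can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `prob` and `cond_prob` are illustrative choices, not part of the lecture, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Sample space for one toss of a fair die; every face has probability 1/6.
omega = {1, 2, 3, 4, 5, 6}
E = {1, 3, 5}   # 'an odd number appears face up'
I = {4, 5, 6}   # 'a number greater than three appears face up'

def prob(event, space=omega):
    """P(event) when all sample points of `space` are equally likely."""
    return Fraction(len(event & space), len(space))

def cond_prob(event, info):
    """P(event | info) = P(event and info) / P(info); needs P(info) > 0."""
    p_info = prob(info)
    if p_info == 0:
        raise ValueError("P(info) is zero: the formula cannot be applied")
    return prob(event & info) / p_info

p_E = prob(E)                                # unconditional probability of E
p_I = prob(I)                                # probability of the information
p_E_given_I = cond_prob(E, I)                # ratio of probabilities
by_counting = Fraction(len(E & I), len(I))   # favorable cases / possible cases

print(p_E, p_I, p_E_given_I, by_counting)    # 1/2 1/2 1/3 1/3
```

Learning that the outcome is greater than three lowers the probability of an odd number from 1/2 to 1/3, and the counting rule and the ratio of probabilities agree.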
In the next section, we will show that the conditional probability formula
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)}$$
is valid also for more general cases (i.e. when the sample points are not all equally likely). However, this formula already allows us to understand why defining conditional probability is a challenging task. In the conditional probability formula, a division by $P(I)$ is performed. This division is impossible when $I$ is a zero-probability event (i.e. $P(I) = 0$). If we want to be able to define $P(E \mid I)$ also when $P(I) = 0$, then we need to give a more complicated definition of conditional probability. We will return to this point later.
In this section we give a more general definition of conditional probability, by taking an axiomatic approach. First, we list the properties that we would like conditional probability to satisfy. Then, we prove that the conditional probability formula introduced above satisfies these properties. The discussion of the case in which the conditional probability formula cannot be used because $P(I) = 0$ is postponed to the next section.
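The conditional probability $P(E \mid I)$ is required to satisfy the following properties:
Probability measure. $P(\cdot \mid I)$ has to satisfy all the properties of a probability measure.
Sure thing. $P(I \mid I) = 1$.
Impossible events. If $E \subseteq I^c$ ($I^c$, the complement of $I$ with respect to $\Omega$, is the set of all elements of $\Omega$ that do not belong to $I$), then $P(E \mid I) = 0$.
Constant likelihood ratios on $I$. If $E \subseteq I$, $F \subseteq I$ and $P(E) > 0$, then:
$$\frac{P(F \mid I)}{P(E \mid I)} = \frac{P(F)}{P(E)}$$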
These properties are very intuitive:
Probability measure. This property requires that conditional probability measures also satisfy the fundamental properties that any other probability measure needs to satisfy.
Sure thing. This property says that the probability of a sure thing must be $1$: since we know that only things belonging to the set $I$ can happen, the probability of $I$ must be $1$.
Impossible events. This property says that the probability of an impossible thing must be $0$: since we know that things not belonging to the set $I$ will not happen, the probability of the events that are disjoint from $I$ must be $0$.
Constant likelihood ratios on $I$. This property is a bit more complex: it says that if $F \subseteq I$ is, say, two times more likely than $E \subseteq I$ before receiving the information $I$, then $F$ remains two times more likely than $E$ after receiving the information, because all the things in $E$ and $F$ remain possible (can still happen) and, hence, there is no reason to expect that the ratio of their likelihoods changes.
Proposition (Conditional probability formula) Whenever $P(I) > 0$, $P(E \mid I)$ satisfies the four above properties if and only if:
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)}$$
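We first show that
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)}$$
satisfies the four properties whenever $P(I) > 0$. As far as property 1) is concerned, we have to check that the three requirements for a probability measure are satisfied. The first requirement for a probability measure is that $0 \leq P(E \mid I) \leq 1$. Since $E \cap I \subseteq I$, by the monotonicity of probability we have that:
$$P(E \cap I) \leq P(I)$$
hence:
$$\frac{P(E \cap I)}{P(I)} \leq 1$$
Furthermore, since $P(E \cap I) \geq 0$ and $P(I) > 0$, also
$$\frac{P(E \cap I)}{P(I)} \geq 0$$
The second requirement for a probability measure is that $P(\Omega \mid I) = 1$. This is satisfied because:
$$P(\Omega \mid I) = \frac{P(\Omega \cap I)}{P(I)} = \frac{P(I)}{P(I)} = 1$$
The third requirement for a probability measure is that for any sequence of disjoint sets $E_1, E_2, \ldots$ the following holds:
$$P\left(\bigcup_{n=1}^{\infty} E_n \,\Big|\, I\right) = \sum_{n=1}^{\infty} P(E_n \mid I)$$
But:
$$P\left(\bigcup_{n=1}^{\infty} E_n \,\Big|\, I\right) = \frac{P\left(\left(\bigcup_{n=1}^{\infty} E_n\right) \cap I\right)}{P(I)} = \frac{P\left(\bigcup_{n=1}^{\infty} (E_n \cap I)\right)}{P(I)} = \frac{\sum_{n=1}^{\infty} P(E_n \cap I)}{P(I)} = \sum_{n=1}^{\infty} P(E_n \mid I)$$
so that also the third requirement is satisfied. Property 2) is trivially satisfied:
$$P(I \mid I) = \frac{P(I \cap I)}{P(I)} = \frac{P(I)}{P(I)} = 1$$
Property 3) is verified because, if $E \subseteq I^c$, then:
$$P(E \mid I) = \frac{P(E \cap I)}{P(I)} = \frac{P(\emptyset)}{P(I)} = 0$$
Property 4) is verified because, if $E \subseteq I$, $F \subseteq I$ and $P(E) > 0$, then:
$$\frac{P(F \mid I)}{P(E \mid I)} = \frac{P(F \cap I)}{P(I)} \cdot \frac{P(I)}{P(E \cap I)} = \frac{P(F \cap I)}{P(E \cap I)} = \frac{P(F)}{P(E)}$$
So, the 'if' part has been proved.
Now we prove the 'only if' part. We prove it by contradiction. Suppose there exists another conditional probability $\overline{P}$ that satisfies the four properties. Then, there exists an event $E$ such that
$$\overline{P}(E \mid I) \neq P(E \mid I)$$
It cannot be that $E \subseteq I$, otherwise we would have:
$$\frac{\overline{P}(E \mid I)}{\overline{P}(I \mid I)} = \frac{\overline{P}(E \mid I)}{1} \neq P(E \mid I) = \frac{P(E \cap I)}{P(I)} = \frac{P(E)}{P(I)}$$
which would be a contradiction, since if $\overline{P}$ were a conditional probability it would satisfy (by property 4, applied to the events $I$ and $E$):
$$\frac{\overline{P}(E \mid I)}{\overline{P}(I \mid I)} = \frac{P(E)}{P(I)}$$
If $E$ is not a subset of $I$, then $\overline{P}(E \mid I) \neq P(E \mid I)$ implies also $\overline{P}(E \cap I \mid I) \neq P(E \cap I \mid I)$, because (by additivity and property 3):
$$\overline{P}(E \mid I) = \overline{P}(E \cap I \mid I) + \overline{P}(E \cap I^c \mid I) = \overline{P}(E \cap I \mid I)$$
and
$$P(E \mid I) = P(E \cap I \mid I) + P(E \cap I^c \mid I) = P(E \cap I \mid I)$$
but this would also lead to a contradiction, because $E \cap I \subseteq I$, which is the case already ruled out above.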
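The four properties can also be checked numerically on a space whose sample points are not equally likely. The sketch below is only an illustration: the four-point space and its probabilities are hypothetical values chosen for convenience, and a finite check of this kind does not replace the proof.

```python
from fractions import Fraction

# Hypothetical, non-uniform probabilities on a four-point sample space.
weights = {
    "a": Fraction(1, 2),
    "b": Fraction(1, 4),
    "c": Fraction(1, 8),
    "d": Fraction(1, 8),
}
omega = set(weights)

def prob(event):
    """P(event): sum of the probabilities of its sample points."""
    return sum((weights[w] for w in event), Fraction(0))

def cond_prob(event, info):
    """P(event | info) = P(event and info) / P(info); assumes P(info) > 0."""
    return prob(event & info) / prob(info)

I = {"b", "c", "d"}   # the information received, P(I) = 1/2
E = {"b"}             # subset of I with P(E) = 1/4
F = {"c"}             # subset of I with P(F) = 1/8

# Property 1 (partial check): total mass one and additivity on disjoint sets.
total_mass = cond_prob(omega, I)
additive = cond_prob(E | F, I) == cond_prob(E, I) + cond_prob(F, I)

# Property 2: sure thing.
sure_thing = cond_prob(I, I)

# Property 3: events contained in the complement of I get probability zero.
impossible = cond_prob(omega - I, I)

# Property 4: likelihood ratios of subsets of I are unchanged.
ratio_before = prob(F) / prob(E)
ratio_after = cond_prob(F, I) / cond_prob(E, I)

print(total_mass, additive, sure_thing, impossible, ratio_before, ratio_after)
# 1 True 1 0 1/2 1/2
```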
In the previous section we have generalized the concept of conditional probability. However, we have not been able to define the conditional probability $P(E \mid I)$ for the case in which $P(I) = 0$. This case is discussed in the lectures entitled Conditional probability as a random variable and Conditional probability distributions.
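Let $I_1$, ..., $I_n$ be $n$ events having the following characteristics:
they are mutually disjoint: $I_j \cap I_k = \emptyset$ whenever $j \neq k$;
they cover all the sample space: $\bigcup_{j=1}^{n} I_j = \Omega$;
they have strictly positive probability: $P(I_j) > 0$ for any $j$.
The events $I_1$, ..., $I_n$ form a partition of $\Omega$.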
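The law of total probability states that, for any event $E$, the following holds:
$$P(E) = P(E \mid I_1) P(I_1) + \ldots + P(E \mid I_n) P(I_n)$$
which can, of course, also be written as:
$$P(E) = \sum_{j=1}^{n} P(E \mid I_j) P(I_j)$$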
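The law of total probability is proved as follows:
$$P(E) = P(E \cap \Omega) = P\left(E \cap \bigcup_{j=1}^{n} I_j\right) = P\left(\bigcup_{j=1}^{n} (E \cap I_j)\right) = \sum_{j=1}^{n} P(E \cap I_j) = \sum_{j=1}^{n} P(E \mid I_j) P(I_j)$$
The fourth equality holds because the events $E \cap I_j$ are mutually disjoint, and the last one follows from the conditional probability formula, which can be applied because $P(I_j) > 0$ for any $j$.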
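The law of total probability can be verified numerically on the fair die used earlier. This is a minimal sketch: the partition into three blocks of different sizes is a hypothetical choice made for illustration, and `prob` and `cond_prob` are illustrative helper names.

```python
from fractions import Fraction

# Fair die: every face has probability 1/6.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) when all six faces are equally likely."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(event, info):
    """P(event | info) = P(event and info) / P(info); assumes P(info) > 0."""
    return prob(event & info) / prob(info)

E = {1, 3, 5}                          # 'an odd number appears face up'
partition = [{1}, {2, 3}, {4, 5, 6}]   # hypothetical partition of omega

# The three requirements on the events I_1, ..., I_n.
disjoint = all(
    not (a & b)
    for i, a in enumerate(partition)
    for b in partition[i + 1:]
)
covers = set().union(*partition) == omega
positive = all(prob(block) > 0 for block in partition)

# Right-hand side of the law: sum over j of P(E | I_j) * P(I_j).
terms = [cond_prob(E, block) * prob(block) for block in partition]
total = sum(terms, Fraction(0))

print(disjoint, covers, positive)   # True True True
print(terms, total, prob(E))        # each term is 1/6; both totals are 1/2
```

Each block contributes $1/6$ here: the conditional probabilities are $1$, $1/2$ and $1/3$, and they are weighted by $1/6$, $1/3$ and $1/2$ respectively.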
Some solved exercises on conditional probability can be found below:
Exercise set 1 (computation of conditional probability)