Statlect: The Digital Textbook

Random variables

A random variable is a variable whose value depends on the outcome of a probabilistic experiment. Its value is a priori unknown, but it becomes known once the outcome of the experiment is realized.

Definition

Denote by $\Omega$ the sample space (the set of all possible outcomes of the experiment). A random variable associates a real number with each element of $\Omega$, as stated by the following definition.

Definition A random variable X is a function from the sample space $\Omega$ to the set of real numbers $\mathbb{R}$: $X:\Omega \rightarrow \mathbb{R}$

In rigorous (measure-theoretic) probability theory, the function X is also required to be measurable (see the section "A more rigorous definition of random variable" below).

The real number $X(\omega)$ associated with a sample point $\omega \in \Omega$ is called a realization of the random variable. The set of all possible realizations is called the support and is denoted by R_X.

Notation

Some remarks on notation are in order:

  1. The dependence of X on $\omega$ is often omitted; that is, we simply write X instead of $X(\omega)$.

  2. If $A\subseteq \mathbb{R}$, the exact meaning of the notation $\mathrm{P}(X\in A)$ is the following: $\mathrm{P}(X\in A)=\mathrm{P}(\{\omega \in \Omega :X(\omega )\in A\})$

  3. If $A\subseteq \mathbb{R}$, we sometimes use the notation $\mathrm{P}_{X}(A)$ with the following meaning: $\mathrm{P}_{X}(A)=\mathrm{P}(X\in A)=\mathrm{P}(\{\omega \in \Omega :X(\omega )\in A\})$. In this case, $\mathrm{P}_{X}$ is to be interpreted as a probability measure on the set of real numbers, induced by the random variable X. Often, statisticians construct probabilistic models where a random variable X is defined by directly specifying $\mathrm{P}_{X}$, without specifying the sample space $\Omega$.
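
The third remark can be made concrete on a finite sample space. The following is a minimal sketch, not part of the lecture: the fair six-sided die, the map X (remainder of the face value divided by 3) and the helper names `prob` and `induced_prob` are all assumptions chosen for illustration. It computes the induced probability of a set A of real numbers as the probability of the set of outcomes that X maps into A.

```python
from fractions import Fraction

# Hypothetical experiment (an assumption for illustration): a fair six-sided die.
omega_space = [1, 2, 3, 4, 5, 6]
P = {omega: Fraction(1, 6) for omega in omega_space}

def X(omega):
    """Hypothetical random variable: remainder of the face value divided by 3."""
    return omega % 3

def prob(event):
    """Probability of an event, i.e. of a collection of sample points."""
    return sum(P[omega] for omega in event)

def induced_prob(A):
    """Induced probability of A: P({omega in Omega : X(omega) in A})."""
    preimage = [omega for omega in omega_space if X(omega) in A]
    return prob(preimage)

print(induced_prob({0}))     # P(X = 0)
print(induced_prob({1, 2}))  # P(X in {1, 2})
```

Exact fractions are used so that the probabilities are not affected by floating-point rounding.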

Example

The following example illustrates how the realizations of a random variable are associated with the outcomes of a probabilistic experiment.

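Example Suppose that we flip a coin. The possible outcomes are either tail ($T$) or head ($H$), that is, $\Omega =\{T,H\}$. The two outcomes are assigned equal probabilities: $\mathrm{P}(\{T\})=\mathrm{P}(\{H\})=\frac{1}{2}$. If tail ($T$) is the outcome, we win one dollar; if head ($H$) is the outcome, we lose one dollar. The amount X we win (or lose) is a random variable, defined as follows: $X(\omega )=\begin{cases}1 & \text{if }\omega =T\\ -1 & \text{if }\omega =H\end{cases}$. The probability of winning one dollar is $\mathrm{P}(X=1)=\mathrm{P}(\{T\})=\frac{1}{2}$. The probability of losing one dollar is $\mathrm{P}(X=-1)=\mathrm{P}(\{H\})=\frac{1}{2}$. The probability of losing two dollars is $\mathrm{P}(X=-2)=\mathrm{P}(\emptyset )=0$.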
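
The coin example can also be explored by simulation. The sketch below is only an illustration: the seed, the number of flips and the helper name `relative_frequency` are assumptions, and the relative frequencies it prints approximate the probabilities of the example rather than reproduce them exactly.

```python
import random

def X(omega):
    """Winnings in dollars: +1 on tail ("T"), -1 on head ("H")."""
    return 1 if omega == "T" else -1

rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
n_flips = 10_000        # hypothetical number of repetitions of the experiment

# Each flip realizes an outcome omega; X(omega) is the realization of X.
realizations = [X(rng.choice(["T", "H"])) for _ in range(n_flips)]

def relative_frequency(x):
    """Share of the simulated realizations that are equal to x."""
    return realizations.count(x) / n_flips

for x in (1, -1, -2):
    print(x, relative_frequency(x))
```

The value -2 never appears among the realizations, because no outcome of the experiment is mapped to it.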

Discrete random variables

Most of the time, statisticians deal with two special kinds of random variables:

  1. discrete random variables;

  2. absolutely continuous random variables.

This section defines the first kind (discrete) while the next section describes the second kind (absolutely continuous).

Definition A random variable X is discrete if

  1. its support R_X is a countable set;

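  2. there is a function $p_{X}:\mathbb{R}\rightarrow [0,1]$, called the probability mass function (or pmf or probability function) of X, such that, for any $x\in \mathbb{R}$: $p_{X}(x)=\begin{cases}\mathrm{P}(X=x) & \text{if }x\in R_{X}\\ 0 & \text{if }x\notin R_{X}\end{cases}$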

The following is an example of a discrete random variable.

Example A Bernoulli random variable is an example of a discrete random variable. It can take only two values: 1 with probability $q$ and 0 with probability $1-q$, where $0<q<1$. Its support is $R_{X}=\{0,1\}$. Its probability mass function is $p_{X}(x)=\begin{cases}q & \text{if }x=1\\ 1-q & \text{if }x=0\\ 0 & \text{otherwise}\end{cases}$

The properties of probability mass functions are discussed in more detail in the lecture entitled Legitimate probability mass functions. We anticipate here that probability mass functions are characterized by two fundamental properties:

  1. Non-negativity: $p_{X}(x)\geq 0$ for any $x\in \mathbb{R}$;

  2. Sum over the support equals 1: $\sum_{x\in R_{X}}p_{X}(x)=1$.

It turns out not only that any probability mass function must satisfy these two properties, but also that any function satisfying these two properties is a legitimate probability mass function. You can find a detailed discussion of this fact in the aforementioned lecture.
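
As a minimal sketch of how the two properties can be checked in practice, the code below tests them for the Bernoulli probability mass function described above. The function names `bernoulli_pmf` and `is_legitimate_pmf` and the parameter value q = 3/10 are assumptions made for illustration. Non-negativity is checked only on the support, because the function is zero everywhere else by construction.

```python
from fractions import Fraction

def bernoulli_pmf(x, q):
    """pmf of a Bernoulli random variable: q at 1, 1 - q at 0, 0 elsewhere."""
    if x == 1:
        return q
    if x == 0:
        return 1 - q
    return 0

def is_legitimate_pmf(pmf, support):
    """Check non-negativity on the support and that the values sum to 1."""
    values = [pmf(x) for x in support]
    return all(v >= 0 for v in values) and sum(values) == 1

q = Fraction(3, 10)  # hypothetical parameter value
support = [0, 1]
print(is_legitimate_pmf(lambda x: bernoulli_pmf(x, q), support))  # True
```

Exact fractions make the test "sum equals 1" reliable; with floating-point numbers a small tolerance would be needed instead.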

Absolutely continuous random variables

Absolutely continuous random variables are defined as follows.

Definition A random variable X is absolutely continuous if

  1. its support R_X is not countable;

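  2. there is a function $f_{X}:\mathbb{R}\rightarrow [0,\infty )$, called the probability density function (or pdf or density function) of X, such that, for any interval $[a,b]\subseteq \mathbb{R}$: $\mathrm{P}(X\in [a,b])=\int_{a}^{b}f_{X}(x)\,dx$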

Absolutely continuous random variables are often called continuous random variables, omitting the adverb absolutely.

The following is an example of an absolutely continuous random variable.

Example A uniform random variable (on the interval $\left[ 0,1\right] $) is an example of an absolutely continuous random variable. It can take any value in the interval $\left[ 0,1\right] $. All sub-intervals of equal length are equally likely. Its support is $R_{X}=\left[ 0,1\right] $. Its probability density function is $f_{X}(x)=\begin{cases}1 & \text{if }x\in \left[ 0,1\right] \\ 0 & \text{otherwise}\end{cases}$. The probability that the realization of X belongs, for example, to the interval [eq28] is[eq29]
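
The probability of an interval can be approximated numerically by integrating the density. The following is a rough sketch under stated assumptions: the midpoint rule, the helper names `uniform_pdf` and `integrate`, and the interval endpoints 0.25 and 0.75 are chosen here for illustration and are not taken from the lecture.

```python
def uniform_pdf(x):
    """Density of a uniform random variable on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical interval, chosen only for illustration.
a, b = 0.25, 0.75
print(integrate(uniform_pdf, a, b))       # approximately b - a
print(integrate(uniform_pdf, -1.0, 2.0))  # approximately 1 (total probability)
```

For a sub-interval of the support, the result is approximately its length, which is why sub-intervals of equal length are equally likely.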

The properties of probability density functions are discussed in more detail in the lecture entitled Legitimate probability density functions. We anticipate here that probability density functions are characterized by two fundamental properties:

  1. Non-negativity: $f_{X}(x)\geq 0$ for any $x\in \mathbb{R}$;

  2. Integral over $\mathbb{R}$ equals 1: $\int_{-\infty }^{\infty }f_{X}(x)\,dx=1$.

It turns out not only that any probability density function must satisfy these two properties, but also that any function satisfying these two properties is a legitimate probability density function. You can find a detailed discussion of this fact in the aforementioned lecture.

Random variables in general

Random variables, including those that are neither discrete nor absolutely continuous, are often characterized in terms of their distribution function.

Definition Let X be a random variable. The distribution function (or cumulative distribution function or cdf) of X is a function $F_{X}:\mathbb{R}\rightarrow [0,1]$ such that $F_{X}(x)=\mathrm{P}(X\leq x)$ for any $x\in \mathbb{R}$.

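If we know the distribution function of a random variable X, then we can easily compute the probability that X belongs to an interval $(a,b]\subseteq \mathbb{R}$ as $\mathrm{P}(a<X\leq b)=F_{X}(b)-F_{X}(a)$

Proof

Note that $\{X\leq b\}=\{X\leq a\}\cup \{a<X\leq b\}$, where the two sets on the right-hand side are disjoint. Hence, by additivity: $F_{X}(b)=\mathrm{P}(X\leq b)=\mathrm{P}(X\leq a)+\mathrm{P}(a<X\leq b)=F_{X}(a)+\mathrm{P}(a<X\leq b)$. By rearranging terms, we get $\mathrm{P}(a<X\leq b)=F_{X}(b)-F_{X}(a)$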
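
The result can be checked on a small discrete case. In the sketch below, the random variable (the face shown by a fair die), the endpoints a = 2 and b = 5, and the helper names `cdf` and `prob_interval` are assumptions for illustration. The probability that X falls in the half-open interval (a, b] is computed twice: directly from the probability mass function, and as a difference of two values of the distribution function.

```python
from fractions import Fraction

# Hypothetical discrete random variable: the face shown by a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """Distribution function: P(X <= x)."""
    return sum(p for value, p in pmf.items() if value <= x)

def prob_interval(a, b):
    """P(a < X <= b), computed directly from the pmf."""
    return sum(p for value, p in pmf.items() if a < value <= b)

a, b = 2, 5
print(prob_interval(a, b))  # direct computation
print(cdf(b) - cdf(a))      # same value via the distribution function
```

The interval is open on the left because the event that X is at most a is subtracted out, so the point a itself is excluded.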

More details

In the following subsections you can find more details on random variables and univariate probability distributions.

Derivative of the distribution function of an absolutely continuous random variable

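Note that, if X is absolutely continuous, then $F_{X}(x)=\mathrm{P}(X\leq x)=\int_{-\infty }^{x}f_{X}(t)\,dt$. Hence, by taking the derivative with respect to $x$ of both sides of the above equation, we obtain $\frac{dF_{X}(x)}{dx}=f_{X}(x)$ at every point $x$ where $f_{X}$ is continuous.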
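
This relation between the distribution function and the density can be verified numerically. The sketch below assumes, purely for illustration, an exponential random variable with rate 2 (so the density is 2·exp(-2x) and the distribution function is 1 - exp(-2x) for non-negative x) and approximates the derivative with a central difference. The names `pdf`, `cdf` and `numerical_derivative` are hypothetical.

```python
import math

lam = 2.0  # hypothetical rate parameter

def pdf(x):
    """Density of an exponential random variable with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def cdf(x):
    """Distribution function of the same random variable."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def numerical_derivative(F, x, h=1e-5):
    """Central-difference approximation of the derivative of F at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    print(x, numerical_derivative(cdf, x), pdf(x))
```

At each of the three points the two printed values agree up to the approximation error of the finite difference.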

Absolutely continuous random variables and zero-probability events

Note that, if X is an absolutely continuous random variable, the probability that X takes on any specific value $x\in R_{X}$ is equal to zero: $\mathrm{P}(X=x)=\int_{x}^{x}f_{X}(t)\,dt=0$. Thus, the event $\{X=x\}$ is a zero-probability event for any $x\in R_{X}$. The lecture entitled Zero-probability events contains a thorough discussion of this apparently paradoxical fact: although it can happen that $X=x$, the event $\{X=x\}$ has zero probability of happening.
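
One way to build intuition for this fact is to look at shrinking intervals around a point. The sketch below assumes a uniform random variable on the interval from 0 to 1 and a hypothetical point x = 0.3; the names `uniform_cdf` and `prob_within` are introduced only for illustration. The probability of a small neighbourhood of x shrinks proportionally to its width, and it is exactly zero when the width is zero.

```python
def uniform_cdf(x):
    """Distribution function of a uniform random variable on [0, 1]."""
    return min(max(x, 0.0), 1.0)

def prob_within(x, eps):
    """P(x - eps < X <= x + eps) for X uniform on [0, 1]."""
    return uniform_cdf(x + eps) - uniform_cdf(x - eps)

x = 0.3  # hypothetical point of the support
for eps in (1e-1, 1e-3, 1e-6, 0.0):
    print(eps, prob_within(x, eps))
```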

A more rigorous definition of random variable

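Random variables can be defined in a more rigorous manner using the terminology of measure theory. Let $(\Omega ,\mathcal{F},\mathrm{P})$ be a probability space. Let X be a function $X:\Omega \rightarrow \mathbb{R}$. Let $\mathcal{B}(\mathbb{R})$ be the Borel sigma-algebra of $\mathbb{R}$ (i.e., the smallest sigma-algebra containing all the open subsets of $\mathbb{R}$). If, for any $B\in \mathcal{B}(\mathbb{R})$, $\{\omega \in \Omega :X(\omega )\in B\}\in \mathcal{F}$, then X is a random variable on $\Omega$. As a consequence, if X satisfies the above property, then for any $B\in \mathcal{B}(\mathbb{R})$, $\mathrm{P}(X\in B)$ can be defined as follows: $\mathrm{P}(X\in B)=\mathrm{P}(\{\omega \in \Omega :X(\omega )\in B\})$, where the probability on the right-hand side is well-defined because the set $\{\omega \in \Omega :X(\omega )\in B\}$ is measurable by the very definition of random variable.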

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

Let X be a discrete random variable. Let its support R_X be[eq54]

Let its probability mass function [eq55] be[eq56]

Calculate the following probability:[eq57]

Solution

By the additivity of probability, we have that[eq58]

Exercise 2

Let X be a discrete random variable. Let its support R_X be the set of the first $20$ natural numbers:[eq59]

Let its probability mass function [eq55] be[eq61]

Compute the probability[eq62]

Solution

By using the additivity of probability, we obtain[eq63]

Exercise 3

Let X be a discrete random variable. Let its support R_X be[eq64]

Let its probability mass function [eq55] be[eq66]where [eq67] is the binomial coefficient.

Calculate the probability[eq68]

Solution

First note that, by additivity:[eq69]

Therefore, in order to compute [eq70], we need to evaluate the probability mass function at the three points $x=0$, $x=1$ and $x=2$:[eq71]

Finally,[eq72]

Exercise 4

Let X be an absolutely continuous random variable. Let its support R_X be[eq73]

Let its probability density function [eq74] be[eq75]

Compute[eq76]

Solution

The probability that an absolutely continuous random variable takes a value in a given interval is equal to the integral of the probability density function over that interval:[eq77]

Exercise 5

Let X be an absolutely continuous random variable. Let its support R_X be[eq73]

Let its probability density function [eq74] be[eq80]

Compute[eq81]

Solution

As in the previous exercise, the probability that X takes a value in a given interval is equal to the integral of its density function over that interval:[eq82]

Exercise 6

Let X be an absolutely continuous random variable. Let its support R_X be[eq83]

Let its probability density function [eq74] be[eq85]where $\lambda >0$.

Compute[eq86]

Solution

As in the previous exercise, we need to compute an integral:[eq87]
