Conditional models

This lecture introduces conditional probability models, a class of statistical models in which sample data are divided into input and output data, and the relation between the two kinds of data is studied by modelling the conditional probability distribution of the outputs given the inputs. This is in contrast to unconditional models (sometimes also called generative models), where the data are studied by modelling the joint distribution of inputs and outputs.

Table of contents

Introduction
Terminology

Regression and classification
Inputs
Outputs

Examples

Linear regression model
Logistic classification model

Introduction

Before introducing conditional models, let us review the main elements of a statistical model (see the lecture entitled Statistical inference):

there is a sample $\xi$, which can be regarded as a realization of a random vector $\Xi$ (for example, $\xi$ could be a vector collecting the realizations of some independent random variables);
the joint distribution function of the sample, denoted by $F_{\Xi}(\xi)$, is not known exactly;
the sample $\xi$ is used to infer some characteristics of $F_{\Xi}(\xi)$;
a model for $\Xi$ is used to make inferences, where a model is simply a set of joint distribution functions to which $F_{\Xi}(\xi)$ is assumed to belong.

In a conditional model, the sample $\xi$ is partitioned into inputs and outputs:

$$\xi = \begin{bmatrix} y & x \end{bmatrix}$$

where $y$ denotes the vector of outputs and $x$ the vector of inputs. The object of interest is the conditional distribution function of the outputs given the inputs

$$F_{Y|X=x}(y)$$

and specifying a conditional model means specifying a set of conditional distribution functions to which $F_{Y|X=x}(y)$ is assumed to belong.

In other words, in a conditional model, the problem of model specification is simplified by narrowing the statistician's focus to the conditional distribution of the outputs and by ignoring the distribution of the inputs. This can be seen, for example, in the case in which both inputs and outputs are continuous random variables. In such a case, specifying an unconditional model is equivalent to specifying a joint probability density function $f_{X,Y}(x,y)$ for the inputs and the outputs. But a joint density can be seen as the product of a marginal and a conditional density:

$$f_{X,Y}(x,y) = f_{X}(x)\, f_{Y|X=x}(y)$$

So, in an unconditional model we explicitly or implicitly specify both the marginal probability density function $f_{X}(x)$ and the conditional probability density function $f_{Y|X=x}(y)$. On the other hand, in a conditional model, we specify only the conditional and we leave the marginal unspecified.
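The factorization of a joint density into a marginal and a conditional can be checked numerically. The following is a minimal sketch under assumed distributions that do not come from the lecture: the input is taken to be standard normal and the output, given the input, to be normal with mean $2x$ and unit variance. All function names are hypothetical.

```python
import math


def normal_pdf(z, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    scale = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-((z - mean) ** 2) / (2.0 * variance)) / scale


def marginal_density(x):
    # Assumed marginal of the input: X ~ N(0, 1).
    return normal_pdf(x, 0.0, 1.0)


def conditional_density(y, x):
    # Assumed conditional of the output: Y | X = x ~ N(2x, 1).
    return normal_pdf(y, 2.0 * x, 1.0)


def joint_density_factorized(x, y):
    # Joint density written as marginal times conditional.
    return marginal_density(x) * conditional_density(y, x)


def joint_density_direct(x, y):
    # Bivariate normal density with zero means and covariance matrix
    # [[1, 2], [2, 5]], which is the joint law implied by the two
    # assumptions above (the determinant of the covariance matrix is 1).
    quadratic_form = 5.0 * x ** 2 - 4.0 * x * y + y ** 2
    return math.exp(-0.5 * quadratic_form) / (2.0 * math.pi)


# The two ways of computing the joint density agree at any point.
difference = joint_density_factorized(0.3, -1.2) - joint_density_direct(0.3, -1.2)
```

A conditional model would specify only `conditional_density` and leave `marginal_density` unspecified, whereas an unconditional model would pin down both.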

Terminology

This section presents some of the terminology that is often used when dealing with conditional models.

Regression and classification

The following distinction is often made, especially in the field of machine learning:

if the output is a continuous random variable, then a conditional model is called a regression model;
if the output is a discrete random variable, taking finitely many values (typically few), then a conditional model is called a classification model.

Inputs

The input variables are often called:

predictors
independent variables
features
explanatory variables
regressors (in the context of regression models)

Outputs

The output variables are often called:

predictands
dependent variables
target variables
response variables
regressands (in the context of regression models)

Examples

The following subsections introduce some examples of conditional models.

Linear regression model

The linear regression model is probably the oldest, best understood and most widely used conditional model. In the linear regression model, the response variables are assumed to be a linear function of the inputs:

$$y_{i} = x_{i}\beta + \varepsilon_{i}$$

where $i$ denotes any observation from the sample, $y_{i}$ is a scalar output, $x_{i}$ is a vector of inputs, $\beta$ is a vector of constants (called regression coefficients) and $\varepsilon_{i}$ is an unobservable random variable that adds noise to the linear relationship between inputs and outputs.

A linear regression model is specified by making assumptions about the error term $\varepsilon_{i}$. For example, $\varepsilon_{i}$ is often assumed to have a normal distribution with zero mean and to be independent of $x_{i}$. In such a case, we have that, conditional on the inputs $x_{i}$, the output $y_{i}$ has a normal distribution with mean $x_{i}\beta$. As a consequence, the conditional density of $y_{i}$ is

$$f_{Y_{i}|X_{i}=x_{i}}(y_{i}) = \left(2\pi\sigma^{2}\right)^{-1/2} \exp\left(-\frac{1}{2\sigma^{2}}\left(y_{i}-x_{i}\beta\right)^{2}\right)$$

where $\sigma^{2}$ is the variance of $\varepsilon_{i}$.
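As an illustration, the conditional density of the output under normal errors can be evaluated with a few lines of Python. This is only a sketch: the function names and the numerical values of the inputs, the coefficients and the variance are assumptions made for the example.

```python
import math


def linear_predictor(x_i, beta):
    """Product x_i * beta for a vector of inputs and a coefficient vector."""
    if len(x_i) != len(beta):
        raise ValueError("x_i and beta must have the same length")
    return sum(x * b for x, b in zip(x_i, beta))


def regression_conditional_density(y_i, x_i, beta, sigma2):
    """Normal density of y_i given x_i: mean x_i * beta, variance sigma2."""
    mean = linear_predictor(x_i, beta)
    scale = math.sqrt(2.0 * math.pi * sigma2)
    return math.exp(-((y_i - mean) ** 2) / (2.0 * sigma2)) / scale


# Illustrative values (assumptions, not taken from the lecture).
x_i = [1.0, 2.0]   # the first entry plays the role of an intercept
beta = [0.5, 1.5]  # regression coefficients, so x_i * beta = 3.5
sigma2 = 4.0       # variance of the error term

# The density is highest at the conditional mean and symmetric around it.
density_at_mean = regression_conditional_density(3.5, x_i, beta, sigma2)
density_below = regression_conditional_density(2.5, x_i, beta, sigma2)
density_above = regression_conditional_density(4.5, x_i, beta, sigma2)
```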

The parameters $\beta$ and $\sigma^{2}$ are usually unknown and need to be estimated. So, we have a different conditional distribution for each of the values of $\beta$ and $\sigma^{2}$ that are deemed plausible by the statistician before observing the sample. The set of all these conditional distributions (associated to the different parameters) constitutes the conditional model for $y_{i}$.
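The idea of a model as a set of conditional distributions indexed by parameters can be sketched in code. In the example below, the set of plausible parameters is a small finite grid, the observations are treated as independent, and the member of the set that best fits the sample is chosen by maximizing the conditional log-likelihood. All of these are assumptions made for illustration: the lecture does not prescribe an estimation method, and the toy sample and candidate values are made up.

```python
import math
from itertools import product


def log_conditional_density(y_i, x_i, beta, sigma2):
    """Log of the normal density of y_i given x_i (mean x_i * beta)."""
    mean = sum(x * b for x, b in zip(x_i, beta))
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (y_i - mean) ** 2 / (2.0 * sigma2)


def conditional_log_likelihood(outputs, inputs, beta, sigma2):
    """Sum of log conditional densities, assuming independent observations."""
    return sum(
        log_conditional_density(y, x, beta, sigma2)
        for y, x in zip(outputs, inputs)
    )


# Toy sample (made up): each input vector is an intercept plus one regressor.
inputs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
outputs = [1.1, 2.9, 5.2, 6.8]

# The conditional model: one conditional distribution per candidate parameter.
candidate_betas = [(1.0, 2.0), (0.0, 1.0), (2.0, 2.0)]
candidate_sigma2s = [0.05, 1.0]
model = list(product(candidate_betas, candidate_sigma2s))

# Pick the member of the model with the highest conditional log-likelihood.
best_beta, best_sigma2 = max(
    model,
    key=lambda params: conditional_log_likelihood(outputs, inputs, params[0], params[1]),
)
```

In practice the set of plausible parameters is usually a continuum rather than a grid, and the maximization is carried out analytically or numerically.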

To learn more about linear regression you can read:

the introductory lecture on linear regression models;
the lecture on the linear regression model with normal errors.

Logistic classification model

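In the logistic classification model, the response variable $y_{i}$ is a Bernoulli random variable: it can take only two values, either $1$ or $0$. It is assumed that the conditional probability mass function of $y_{i}$ is a non-linear function of the inputs $x_{i}$:

$$p_{Y_{i}|X_{i}=x_{i}}(y_{i}) = \begin{cases} S(x_{i}\beta) & \text{if } y_{i}=1 \\ 1-S(x_{i}\beta) & \text{if } y_{i}=0 \\ 0 & \text{otherwise} \end{cases}$$

where $x_{i}$ is a vector of inputs, $\beta$ is a vector of constants and $S$ is the logistic function defined by

$$S(t) = \frac{1}{1+\exp(-t)}$$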
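A minimal sketch of the conditional probability mass function of the logistic classification model follows. The function names and the numerical values are assumptions made for the example, and the branching inside `logistic` is only a common way to avoid numerical overflow for large negative arguments.

```python
import math


def logistic(t):
    """Logistic function S(t) = 1 / (1 + exp(-t)), computed without overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)  # safe because t < 0
    return z / (1.0 + z)


def logistic_conditional_pmf(y_i, x_i, beta):
    """Probability that the output equals y_i, given the inputs x_i."""
    t = sum(x * b for x, b in zip(x_i, beta))
    probability_of_one = logistic(t)
    if y_i == 1:
        return probability_of_one
    if y_i == 0:
        return 1.0 - probability_of_one
    return 0.0  # the output can take only the values 0 and 1


# Illustrative values (assumptions): here x_i * beta = 0.5 - 0.5 = 0,
# so both outcomes have conditional probability one half.
x_i = [1.0, 2.0]
beta = [0.5, -0.25]
prob_one = logistic_conditional_pmf(1, x_i, beta)
prob_zero = logistic_conditional_pmf(0, x_i, beta)
```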

To learn more, you can read the lecture about the logistic model.

How to cite

Please cite as:

Taboga, Marco (2021). "Conditional models", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/conditional-models.