Index

Digital textbook on probability and statistics

Statlect is a free online textbook on probability, statistics and matrix algebra. It contains hundreds of lectures, diagrams, examples and exercises. Explore its main sections.

Fundamentals of probability theory

Read a rigorous yet accessible introduction to the main concepts of probability theory, such as random variables, expected value, variance, correlation and conditional probability.

Probability distributions

Explore this compendium of common probability distributions, including the binomial, Poisson, uniform, exponential and normal distributions.

The number of repeated Bernoulli trials performed before obtaining a success has a geometric distribution.

Asymptotic theory

Learn about stochastic convergence, including convergence in probability, almost sure convergence and convergence in distribution; read about the Central Limit Theorem and the Law of Large Numbers.

Fundamentals of statistics

A book-length introduction to the basics of mathematical statistics; learn about statistical inference, point estimation, interval estimation and hypothesis testing.

Machine learning

Lectures on some of the most popular methods and models used in machine learning and predictive analytics, with thoroughly commented Python examples.

Glossary of probability and statistics terms

Use this online glossary to review the most important technical terms introduced in the digital textbook. Some glossary entries also contain additional explanations and examples.

SimpleR is StatLect's linear regression tool. You can estimate multiple linear regressions in seconds without coding.

Mathematical tools

Learn about mathematical concepts that are frequently used in probability theory and statistics.

Matrix algebra

This is an online textbook containing about one hundred lectures on the most important topics in matrix algebra.

Other mathematical tools

Review the basics of calculus; learn about the fundamentals of combinatorial analysis, such as permutations and combinations; discover special functions used in statistics.

Highlight

In the online textbook, hundreds of diagrams, images, plots and videos are used to illustrate important concepts in probability and statistics. Here is an example.

What's new

Read the latest additions to the digital textbook.

Multiple regression calculator

We created a linear regression tool that you can use in your browser to run regressions effortlessly and without coding.

Model misspecification

When some of the assumptions of a statistical model are wrong, then the model is misspecified. The consequences can be catastrophic.

Find our best images on our Pinterest board

Our Pinterest board collects all the best images in the textbook.

Popular pages

Here is a selection of popular pages on Statlect, subdivided by topic.

Probability theory

Moment generating function

The moment generating function is often used to characterize the probability distribution of a random variable. Its derivatives at zero are equal to the moments of the random variable.
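As a rough illustration of the statement that the derivatives at zero equal the moments, the sketch below differentiates a moment generating function numerically. It assumes an exponential random variable with rate 2, whose moment generating function is rate / (rate - t) for t < rate; the function names, the rate and the step size are choices made for this example only.

```python
def mgf_exponential(t: float, rate: float) -> float:
    """MGF of an exponential random variable with the given rate (defined for t < rate)."""
    if t >= rate:
        raise ValueError("the MGF of the exponential distribution requires t < rate")
    return rate / (rate - t)


def derivative_at_zero(f, order: int, h: float = 1e-3) -> float:
    """Central finite-difference approximation of the first or second derivative of f at 0."""
    if order == 1:
        return (f(h) - f(-h)) / (2.0 * h)
    if order == 2:
        return (f(h) - 2.0 * f(0.0) + f(-h)) / h**2
    raise ValueError("only orders 1 and 2 are implemented in this sketch")


rate = 2.0  # illustrative value, not taken from the textbook


def mgf(t: float) -> float:
    return mgf_exponential(t, rate)


first_moment = derivative_at_zero(mgf, order=1)   # approximates E[X] = 1 / rate
second_moment = derivative_at_zero(mgf, order=2)  # approximates E[X^2] = 2 / rate**2

print(first_moment, second_moment)
```

Both approximations should be close to 0.5, since 1 / rate and 2 / rate**2 coincide when the rate is 2.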

Expected value

A gentle introduction to the concept of expected value, with an informal definition and more formal definitions based on the Stieltjes and Lebesgue integrals.

Bayes' rule

Bayes' rule is a formula that allows us to compute the conditional probability of a given event, after observing a second event whose conditional and unconditional probabilities were known in advance.
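A minimal sketch of the formula, P(A | B) = P(B | A) P(A) / P(B), is given below. The function name and all the probabilities are hypothetical values chosen for illustration; they do not come from the textbook.

```python
def bayes_rule(prob_a: float, prob_b_given_a: float, prob_b: float) -> float:
    """Return P(A | B) given P(A), P(B | A) and the unconditional probability P(B)."""
    if prob_b <= 0.0:
        raise ValueError("P(B) must be positive for P(A | B) to be defined")
    joint = prob_b_given_a * prob_a  # P(A and B)
    if joint > prob_b:
        raise ValueError("inconsistent inputs: P(A and B) cannot exceed P(B)")
    return joint / prob_b


# Hypothetical numbers: A is a rare event, B is an observed signal.
posterior = bayes_rule(prob_a=0.01, prob_b_given_a=0.95, prob_b=0.059)
print(posterior)
```

With these inputs the conditional probability of A rises from 0.01 to roughly 0.16 once B has been observed.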

Beta function

The Beta function is often employed in probability theory and statistics, for example, as a normalizing constant in the density functions of the F and Student's t distributions.

Plots showing all the problems that may arise when quantiles are defined in a naive manner.

Probability distributions

Exponential distribution

The exponential distribution is a continuous probability distribution used to model the time we need to wait before a given event occurs.

Beta distribution

The Beta distribution is a continuous probability distribution having two parameters. One of its most common uses is to model one's uncertainty about the probability of success of an experiment.

Poisson distribution

The Poisson distribution is a discrete probability distribution used to model the number of occurrences of an unpredictable event within a unit of time.

Binomial distribution

A discrete distribution used to model the number of successes obtained by repeating several times an experiment that has only two possible outcomes, success or failure.

One of the most popular plots in the probability and statistics textbook, used to illustrate the relationship between the Poisson and the exponential distributions.

Asymptotics

Convergence in probability

The concept of convergence in probability is based on the following intuition: two random variables are "close to each other" if there is a high probability that their difference will be very small.

Central Limit Theorems

A Central Limit Theorem provides a set of conditions that are sufficient for the sample mean, suitably centered and scaled, to have a normal distribution asymptotically (as the sample size increases).

Statistics

Maximum likelihood

Maximum likelihood is an estimation method that allows us to use observed data to estimate the parameters of the probability distribution that generated the data.

Likelihood ratio test

A statistical test based on comparing the likelihoods attained by two parameter estimates, a restricted one and an unrestricted one.

Linear regression models

The digital textbook contains several lectures on the linear regression model. This is the introductory one.

Probability density function of the z-statistic. The size of the test is the area in the two tails of the distribution.

Wald test

A test that is often performed on parameters that have been estimated by maximum likelihood, based on a test statistic called the Wald statistic.

Ridge regression

The ridge estimator of the coefficients of a linear regression is biased but can have lower mean squared error than the OLS estimator.
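The shrinkage behind this bias can be seen in a deliberately simplified case: one regressor and no intercept, where the ridge formula reduces to dividing by the sum of squared regressors plus the penalty. The data, the penalty value and the function names below are made up for illustration. The sketch shows only the shrinkage toward zero, not the comparison of mean squared errors.

```python
def ols_slope(x, y):
    """OLS slope for a regression of y on a single regressor x, without intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx


def ridge_slope(x, y, penalty: float):
    """Ridge slope for the same model: the penalty is added to the denominator."""
    if penalty < 0.0:
        raise ValueError("the ridge penalty must be non-negative")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + penalty)


# Made-up data, roughly following y = 2 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

beta_ols = ols_slope(x, y)
beta_ridge = ridge_slope(x, y, penalty=10.0)

print(beta_ols, beta_ridge)
```

The ridge estimate is smaller in absolute value than the OLS estimate. Setting the penalty to zero recovers OLS exactly.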

Model selection criteria

Model selection criteria, such as the Akaike Information Criterion (AIC), are used to select the best model among a set of candidate statistical models.

Logit model

The logit model is a classification model used to predict the realization of a binary variable on the basis of a set of regressors.

Multicollinearity

If an explanatory variable in a linear regression is highly correlated with a linear combination of other explanatory variables, then the coefficient estimates tend to be very imprecise, because their variances are large.