Probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the
probabilities of occurrence of different possible outcomes for an experiment.[1][2] It is a mathematical
description of a random phenomenon in terms of its sample space and the probabilities of events (subsets of
the sample space).[3]
For instance, if X is used to denote the outcome of a coin toss ("the experiment"), then the probability
distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5 for X = tails (assuming
that the coin is fair). Examples of random phenomena include the weather conditions at some future date,
the height of a randomly selected person, the fraction of male students in a school, the results of a survey to
be conducted, etc.[4]
Contents
Introduction
General definition
Terminology
Basic terms
Discrete probability distributions
Continuous probability distributions
Related terms
Cumulative distribution function
Discrete probability distribution
Cumulative distribution function
Dirac delta representation
Indicator-function representation
One-point distribution
Continuous probability distribution
Cumulative distribution function
Kolmogorov definition
Other kinds of distributions
Random number generation
Common probability distributions and their applications
Linear growth (e.g. errors, offsets)
Exponential growth (e.g. prices, incomes, populations)
Uniformly distributed quantities
Bernoulli trials (yes/no events, with a given probability)
Categorical outcomes (events with K possible outcomes)
Poisson process (events that occur independently with a given rate)
Absolute values of vectors with normally distributed components
Normally distributed quantities operated with sum of squares
As conjugate prior distributions in Bayesian inference
Some specialized applications of probability distributions
See also
Lists
References
Citations
Sources
External links
Introduction
A probability distribution is a mathematical description of the probabilities of events, subsets of the
sample space. The sample space, often denoted by $\Omega$, is the set of all possible outcomes of a random
phenomenon being observed; it may be any set: a set of real numbers, a set of vectors, a set of arbitrary
non-numerical values, etc. For example, the sample space of a coin flip would be
$\Omega = \{\text{heads}, \text{tails}\}$.
In contrast to the discrete case, where individual outcomes such as heads or tails carry positive
probability, when a random variable takes values from a continuum, typically any individual outcome
has probability zero and only events that include infinitely many outcomes, such as intervals, can have
positive probability. For example, consider measuring the weight of a piece of ham in the supermarket, and
assume the scale has many digits of precision. The probability that it weighs exactly 500 g is zero, as it will
most likely have some non-zero decimal digits. Nevertheless, one might demand, in quality control, that a
package of "500 g" of ham must weigh between 490 g and 510 g with at least 98% probability, and this
demand is less sensitive to the accuracy of measurement instruments.
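The quality-control requirement above can be checked numerically once a model for the weight is chosen. The following is a minimal sketch, assuming (hypothetically, not from the text) that the package weight is normally distributed with mean 500 g and standard deviation 4 g; the helper name interval_probability is also illustrative.

```python
from statistics import NormalDist

# Hypothetical model (an assumption, not from the article):
# package weight in grams ~ Normal(mean 500, standard deviation 4).
weight = NormalDist(mu=500.0, sigma=4.0)

def interval_probability(dist, low, high):
    """P(low <= W <= high) for a continuous distribution exposing .cdf()."""
    return dist.cdf(high) - dist.cdf(low)

p_in_spec = interval_probability(weight, 490.0, 510.0)
p_exact = interval_probability(weight, 500.0, 500.0)  # a single point

print(f"P(490 <= W <= 510) = {p_in_spec:.4f}")
print(f"P(W = 500)         = {p_exact:.4f}")
print("meets the 98% requirement:", p_in_spec >= 0.98)
```

Under this assumed model the interval has positive probability while the single point has probability zero, which is the distinction the paragraph draws.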
Continuous probability distributions can be described in several ways. The probability density function
describes the infinitesimal probability of any given value, and the probability that the outcome lies in a
given interval can be computed by integrating the probability density function over that interval.[5] An
alternative description of the distribution is by means of the cumulative distribution function, which
describes the probability that the random variable is no larger than a given value (i.e., $P(X \le x)$ for
some $x$). The cumulative distribution function is the area under the probability density function from
$-\infty$ to $x$, as shown in the accompanying figure.[6]
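The relation between the density and the cumulative distribution function can be illustrated by numerical integration. The sketch below assumes an exponential density with a hypothetical rate of 1.5 and uses a simple midpoint rule; the names RATE, integrate, cdf_numeric and cdf_closed are illustrative choices.

```python
import math

RATE = 1.5  # hypothetical rate parameter, chosen only for illustration

def pdf(x):
    """Exponential density: RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def cdf_numeric(x):
    """Area under the density up to x (the density is zero below 0)."""
    return integrate(pdf, 0.0, x) if x > 0 else 0.0

def cdf_closed(x):
    """Closed-form exponential CDF, used here only as a cross-check."""
    return 1.0 - math.exp(-RATE * x) if x > 0 else 0.0

# Probability of an interval, computed two ways.
p_integral = integrate(pdf, 0.5, 2.0)
p_from_cdf = cdf_closed(2.0) - cdf_closed(0.5)
print(p_integral, p_from_cdf)
```

Both routes give the same interval probability up to the integration error, which is what "the CDF is the area under the density" means in practice.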
Figure: On the left is the probability density function. On the right is the cumulative distribution
function, which is the area under the probability density curve.
General definition
A probability distribution can be described in various forms, such as by a probability mass function or a
cumulative distribution function. One of the most general descriptions, which applies for continuous and
discrete variables, is by means of a probability function $P\colon \mathcal{A} \to \mathbb{R}$ whose input
space $\mathcal{A}$ is related to the sample space, and gives a real number probability as its output.[7]
The probability function $P$ can take as argument subsets of the sample space itself, as in the coin toss
example, where the function $P$ was defined so that $P(\text{heads}) = 0.5$ and $P(\text{tails}) = 0.5$.
However, because of the widespread use of random variables, which transform the sample space into a set
of numbers (e.g., $\mathbb{R}$, $\mathbb{N}$), it is more common to study probability distributions whose
arguments are subsets of these particular kinds of sets (number sets),[8] and all probability distributions
discussed in this article are of this type. It is common to denote as $P(X \in E)$ the probability that a
certain variable $X$ belongs to a certain event $E$.[4][9]
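The above probability function only characterizes a probability distribution if it satisfies all the Kolmogorov
axioms, that is:
$P(X \in E) \ge 0$ for every event $E$, so probabilities are non-negative;
$P(X \in E) \le 1$ for every event $E$, with the set of all outcomes having probability exactly 1; and
$P\left(X \in \bigcup_i E_i\right) = \sum_i P(X \in E_i)$ for any countable family of pairwise disjoint events $\{E_i\}$.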
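On a finite sample space the Kolmogorov axioms (non-negativity, total probability 1, and additivity over disjoint events) can be checked exhaustively. The sketch below assumes a fair six-sided die as the example; on a finite space, countable additivity reduces to additivity for pairs of disjoint events, which is what the code tests.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event):
    """Probability of an event, i.e. a set of outcomes."""
    return sum((pmf[outcome] for outcome in event), Fraction(0))

outcomes = list(pmf)
# Every subset of the sample space is an event here (64 events in total).
events = [frozenset(c)
          for r in range(len(outcomes) + 1)
          for c in combinations(outcomes, r)]

nonnegative = all(prob(e) >= 0 for e in events)
bounded = all(prob(e) <= 1 for e in events)
normalized = prob(frozenset(outcomes)) == 1
additive = all(prob(a | b) == prob(a) + prob(b)
               for a in events for b in events if not (a & b))

print(nonnegative, bounded, normalized, additive)
```

Exact fractions are used so that the additivity comparison is not affected by floating-point rounding.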
The concept of probability function is made more rigorous by defining it as an element of a probability
space $(\mathcal{X}, \mathcal{A}, P)$, where $\mathcal{X}$ is the set of possible outcomes, $\mathcal{A}$ is
the set of all subsets $E \subset \mathcal{X}$ whose probability can be measured, and $P$ is the probability
function, or probability measure, that assigns a probability to each of these measurable subsets
$E \in \mathcal{A}$.[10]
Probability distributions usually belong to one of two classes. A discrete probability distribution is
applicable to scenarios where the set of possible outcomes is discrete (e.g. a coin toss, a roll of a die),
and the probabilities are encoded by a discrete list of the probabilities of the outcomes; this list is
known as a probability mass function. On the other hand, continuous probability distributions are
applicable to scenarios where the set of possible outcomes can take on values in a continuous range
(e.g. real numbers), such as the temperature on a given day. For real-valued outcomes, a continuous
probability distribution can be described by its cumulative distribution function; equivalently,
probabilities are described by a probability density function, and the probability of an event is by
definition the integral of the density over that event.[4][5][9] The normal distribution is a commonly
encountered continuous probability distribution. More complex experiments, such as those involving
stochastic processes defined in continuous time, may demand the use of more general probability
measures.
A probability distribution whose sample space is one-dimensional (for example real numbers, list of labels,
ordered labels or binary) is called univariate, while a distribution whose sample space is a vector space of
dimension 2 or more is called multivariate. A univariate distribution gives the probabilities of a single
random variable taking on various different values; a multivariate distribution (a joint probability
distribution) gives the probabilities of a random vector – a list of two or more random variables – taking on
various combinations of values. Important and commonly encountered univariate probability distributions
include the binomial distribution, the hypergeometric distribution, and the normal distribution. A commonly
encountered multivariate distribution is the multivariate normal distribution.
Besides the probability function, the cumulative distribution function, the probability mass function and the
probability density function, the moment generating function and the characteristic function also serve to
identify a probability distribution, as they uniquely determine an underlying cumulative distribution
function.[11]
Terminology
Some key concepts and terms, widely used in the literature on
the topic of probability distributions, are listed below.[1]
Cumulative distribution function: function evaluating the probability that $X$ will take a
value less than or equal to $x$ for a random variable $X$ (only for real-valued random variables).
Quantile function: the inverse of the cumulative distribution function. Gives $x$ such that, with
probability $q$, $X$ will not exceed $x$.
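For a discrete variable the cumulative distribution function is a step function, so its inverse has to be read as a generalized inverse. The sketch below assumes a fair die and takes the quantile to be the smallest value whose cumulative probability reaches q; this convention and the names cdf and quantile are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction
from itertools import accumulate

# Assumed example: a fair six-sided die, with exact fractions.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
cumulative = list(accumulate(probs))  # F evaluated at each value

def cdf(x):
    """F(x) = P(X <= x)."""
    count = bisect_right(values, x)  # number of values <= x
    return cumulative[count - 1] if count else Fraction(0)

def quantile(q):
    """Smallest value x with F(x) >= q, for 0 < q <= 1."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    return values[bisect_left(cumulative, q)]

for q in (Fraction(1, 6), Fraction(1, 2), Fraction(9, 10)):
    x = quantile(q)
    print(f"q = {q}: quantile = {x}, F(quantile) = {cdf(x)}")
```

By construction F(quantile(q)) is at least q, so with probability at least q the variable does not exceed the returned value.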
Related terms
Support: set of values that can be assumed with non-zero probability by the random
variable. For a random variable $X$, it is sometimes denoted as $R_X$.
Tail:[12] the regions close to the bounds of the random variable, if the pmf or pdf are relatively
low therein. Usually has the form $X > a$, $X < b$, or a union thereof.
Head:[12] the region where the pmf or pdf is relatively high. Usually has the form $a < X < b$.
Expected value or mean: the weighted average of the possible values, using their
probabilities as their weights; or the continuous analog thereof.
Median: the value such that the set of values less than the median, and the set greater than
the median, each have probabilities no greater than one-half.
Mode: for a discrete random variable, the value with highest probability; for a continuous
random variable, a location at which the probability density function has a local peak.
Quantile: the q-quantile is the value $x$ such that $P(X < x) = q$.
Variance: the second moment of the pmf or pdf about the mean; an important measure of the
dispersion of the distribution.
Standard deviation: the square root of the variance, and hence another measure of
dispersion.
Symmetry: a property of some distributions in which the portion of the distribution to the left
of a specific value (usually the median) is a mirror image of the portion to its right.
Skewness: a measure of the extent to which a pmf or pdf "leans" to one side of its mean.
The third standardized moment of the distribution.
Kurtosis: a measure of the "fatness" of the tails of a pmf or pdf. The fourth standardized
moment of the distribution.
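Several of the summary quantities listed above can be computed directly from a probability mass function. The following sketch uses a small hypothetical pmf on the values 0 to 3 (not taken from the text) and treats skewness and kurtosis as the third and fourth standardized moments, as in the definitions above.

```python
import math

# Hypothetical pmf, chosen only for illustration.
pmf = {0: 0.1, 1: 0.5, 2: 0.3, 3: 0.1}

def expectation(g):
    """Probability-weighted average of g(X)."""
    return sum(p * g(x) for x, p in pmf.items())

mean = expectation(lambda x: x)
variance = expectation(lambda x: (x - mean) ** 2)
std_dev = math.sqrt(variance)
skewness = expectation(lambda x: ((x - mean) / std_dev) ** 3)
kurtosis = expectation(lambda x: ((x - mean) / std_dev) ** 4)
mode = max(pmf, key=pmf.get)  # value with the highest probability

def median():
    """Smallest value whose cumulative probability reaches one-half."""
    running = 0.0
    for x in sorted(pmf):
        running += pmf[x]
        if running >= 0.5:
            return x

print(mean, variance, std_dev, skewness, kurtosis, mode, median())
```

The positive skewness reflects the longer right tail of this particular pmf.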
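Cumulative distribution function
The cumulative distribution function $F(x) = P(X \le x)$ of any real-valued random variable $X$ has the
properties:
$F$ is non-decreasing;
$F$ is right-continuous;
$0 \le F(x) \le 1$;
$\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$; and
$\Pr(a < X \le b) = F(b) - F(a)$.
Conversely, any function $F\colon \mathbb{R} \to \mathbb{R}$ that satisfies the first four of the properties
above is the cumulative distribution function of some probability distribution on the real numbers.[13]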
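The properties of a cumulative distribution function can be spot-checked on a concrete step function. The sketch below assumes a fair die and tests the properties on a finite grid, so it is an illustration rather than a proof; the names F, grid and eps are illustrative.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, with exact fractions.
pmf = {v: Fraction(1, 6) for v in range(1, 7)}

def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

grid = [Fraction(k, 4) for k in range(-8, 41)]  # points from -2 to 10
eps = Fraction(1, 10**9)

non_decreasing = all(F(a) <= F(b) for a, b in zip(grid, grid[1:]))
bounded = all(0 <= F(x) <= 1 for x in grid)
# At each jump the value from the right equals F(v): right-continuity.
right_continuous = all(F(v + eps) == F(v) for v in pmf)
# From the left the function jumps by exactly the point mass at v.
jump_sizes_match = all(F(v) - F(v - eps) == pmf[v] for v in pmf)
limits = F(-1000) == 0 and F(1000) == 1
# P(2 < X <= 5) computed directly and as F(5) - F(2).
interval_rule = sum(pmf[v] for v in (3, 4, 5)) == F(5) - F(2)

print(non_decreasing, bounded, right_continuous, jump_sizes_match,
      limits, interval_rule)
```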
Any probability distribution can be decomposed as a mixture of a discrete, an absolutely continuous and a
singular continuous distribution,[14] and thus any cumulative distribution function admits a decomposition
as a weighted sum of the three corresponding cumulative distribution functions.
Note that the points where the cdf jumps always form a countable set; this may be any countable set and
thus may even be dense in the real numbers.
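Dirac delta representation
A discrete probability distribution is often represented with Dirac measures, the probability distributions of
deterministic random variables. For any outcome $\omega$, let $\delta_\omega$ be the Dirac measure
concentrated at $\omega$. Given a discrete probability distribution, there is a countable set $A$ with
$P(X \in A) = 1$ and a probability mass function $p$. If $E$ is any event, then
$$P(X \in E) = \sum_{\omega \in A} p(\omega)\,\delta_\omega(E),$$
or in short, $P_X = \sum_{\omega \in A} p(\omega)\,\delta_\omega$.
Similarly, discrete distributions can be represented with the Dirac delta function as a generalized probability
density function $f$, where $f(x) = \sum_{\omega \in A} p(\omega)\,\delta(x - \omega)$, which means
$$P(X \in E) = \int_E f(x)\,dx = \sum_{\omega \in A} p(\omega) \int_E \delta(x - \omega)\,dx = \sum_{\omega \in A \cap E} p(\omega)$$
for any event $E$.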
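The Dirac-measure representation translates almost literally into code: each Dirac measure answers whether its point lies in the event, and the distribution is the probability-weighted sum of those answers. The sketch below assumes, as an example, the number of heads in three fair coin tosses; the names dirac and prob are illustrative.

```python
from fractions import Fraction

# Assumed example: number of heads in three fair coin tosses.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8),
       2: Fraction(3, 8), 3: Fraction(1, 8)}

def dirac(omega):
    """Dirac measure at omega: gives 1 to events containing omega, else 0."""
    def measure(event):
        return 1 if omega in event else 0
    return measure

def prob(event):
    """P(X in E) as the sum over omega of p(omega) * delta_omega(E)."""
    return sum((p * dirac(w)(event) for w, p in pmf.items()), Fraction(0))

event = {1, 2, 99}  # 99 lies outside the support and contributes nothing
direct = sum(pmf[w] for w in pmf.keys() & event)  # sum over A intersect E
print(prob(event), direct)
```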
Indicator-function representation
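For a discrete random variable $X$, let $u_0, u_1, \dots$ be the values it can take with non-zero probability.
Denote
$$\Omega_i = X^{-1}(u_i) = \{\omega : X(\omega) = u_i\}, \quad i = 0, 1, 2, \dots$$
These are disjoint sets, and for such sets
$$P\left(\bigcup_i \Omega_i\right) = \sum_i P(\Omega_i) = \sum_i P(X = u_i) = 1.$$
It follows that the probability that $X$ takes any value except for $u_0, u_1, \dots$ is zero, and thus one
can write $X$ as
$$X(\omega) = \sum_i u_i\,1_{\Omega_i}(\omega)$$
except on a set of probability zero, where $1_A$ is the indicator function of $A$. This may serve as an
alternative definition of discrete random variables.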
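The indicator-function representation can be verified on a small sample space. The sketch below assumes two fair coin tosses with the random variable counting heads; the names omega_space, preimages and X_rebuilt are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair coin tosses; X counts the heads.
omega_space = list(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega_space}

def X(w):
    return w.count("H")

values = sorted({X(w) for w in omega_space})  # the values u_i
# Preimage of each value: the set of outcomes mapped to it.
preimages = {u: {w for w in omega_space if X(w) == u} for u in values}

def indicator(subset):
    """Indicator function of a subset of the sample space."""
    return lambda w: 1 if w in subset else 0

def X_rebuilt(w):
    """X written as the sum of u_i times the indicator of its preimage."""
    return sum(u * indicator(preimages[u])(w) for u in values)

total_mass = sum(sum(P[w] for w in preimages[u]) for u in values)
print(all(X(w) == X_rebuilt(w) for w in omega_space), total_mass)
```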
One-point distribution
A special case is the discrete distribution of a random variable that can take on only one fixed value; in
other words, it is a deterministic distribution. Expressed formally, the random variable $X$ has a one-point
distribution if it has a possible outcome $x$ such that $P(X = x) = 1$.[18] All other possible outcomes then
have probability 0. Its cumulative distribution function jumps immediately from 0 to 1.
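Continuous probability distribution
A continuous probability distribution is the distribution of a real-valued random variable $X$ for which
there exists a function $f$ such that, for each interval $[a, b]$, the probability of $X$ belonging to
$[a, b]$ is given by the integral of $f$ over that interval:[20][21]
$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$
This is the definition of a probability density function, so that continuous probability distributions are
exactly those with a probability density function.
In particular, the probability for $X$ to take any single value $a$ (that is, $a \le X \le a$) is zero,
because an integral with coinciding upper and lower limits is always equal to zero.
If the interval $[a, b]$ is replaced by any measurable set $A$, the corresponding equality still holds:
$$P(X \in A) = \int_A f(x)\,dx.$$
There are many examples of continuous probability distributions: normal, uniform, chi-squared, and others.
Continuous probability distributions as defined above are precisely those with an absolutely continuous
cumulative distribution function.
In this case, the cumulative distribution function $F$ has the form
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt.$$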
Note on terminology: Some authors use the term "continuous distribution" to denote all distributions whose
cumulative distribution function is continuous, instead of requiring absolute continuity; that is, all
distributions such that $P(X = a) = 0$ for all $a$. This includes the (absolutely) continuous distributions
defined above, but it also includes singular distributions, which are neither absolutely continuous nor
discrete nor a mixture of those, and do not have a density. An example is given by the Cantor distribution.
For a more general definition of density functions and the equivalent absolutely continuous measures, see
absolutely continuous measure.
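The Cantor distribution mentioned above has a continuous cumulative distribution function (the Cantor function) but no density. The following sketch evaluates that function from the base-3 digits of its argument; the function name cantor_cdf and the depth parameter are illustrative, and the truncation after a fixed number of digits is an approximation.

```python
from fractions import Fraction

def cantor_cdf(x, depth=40):
    """Approximate the Cantor function at x using `depth` ternary digits."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    x = Fraction(x)            # exact arithmetic on the ternary digits
    value, weight = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)         # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:         # x lies in a removed middle third: F is flat
            return value + weight
        if digit == 2:
            value += weight
        weight /= 2
    return value

# The function is constant on the removed middle third (1/3, 2/3).
print(cantor_cdf(Fraction(1, 3)), cantor_cdf(0.5), cantor_cdf(Fraction(2, 3)))
# Increments around a point shrink with the window: no single point has mass.
for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 100000)):
    print(h, cantor_cdf(Fraction(1, 4) + h) - cantor_cdf(Fraction(1, 4) - h))
```

The flat stretches have total length 1, so the derivative is zero almost everywhere even though the function rises from 0 to 1; this is why no density exists.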
Kolmogorov definition
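In the measure-theoretic formalization of probability theory, a random variable is defined as a measurable
function $X$ from a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ to a measurable space
$(\mathcal{X}, \mathcal{A})$. Given that probabilities of events of the form
$\{\omega \in \Omega \mid X(\omega) \in A\}$ satisfy Kolmogorov's probability axioms, the probability
distribution of $X$ is the pushforward measure $X_*\mathbb{P}$ of $X$, which is a probability measure on
$(\mathcal{X}, \mathcal{A})$ satisfying $X_*\mathbb{P} = \mathbb{P}X^{-1}$.[22][23][24]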
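On a finite probability space the pushforward measure can be computed by moving each outcome's mass to the value the random variable assigns to it. The sketch below assumes two fair dice with the random variable equal to their sum; the names pushforward, law and preimage_prob are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, uniform measure on the 36 outcomes.
omega_space = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega_space}

def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]

def pushforward(measure, variable):
    """Distribution of `variable` under `measure` on a finite space."""
    dist = defaultdict(Fraction)
    for w, mass in measure.items():
        dist[variable(w)] += mass
    return dict(dist)

law = pushforward(P, X)

def law_of_set(A):
    """Pushforward measure of a set A of values."""
    return sum((law[x] for x in A if x in law), Fraction(0))

def preimage_prob(A):
    """P of the preimage of A under X, computed on the original space."""
    return sum((P[w] for w in omega_space if X(w) in A), Fraction(0))

print(law[7], law_of_set({2, 12}), preimage_prob({2, 12}))
```

The agreement of law_of_set and preimage_prob is the defining property of the pushforward: the measure of a set of values equals the probability of its preimage.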
Other kinds of distributions
One example of a distribution with a complicated support is shown in the accompanying figure, which
displays the evolution of a system of differential equations (commonly known as the Rabinovich–Fabrikant
equations) that can be used to model the behaviour of Langmuir waves in plasma.[26] When this
phenomenon is studied, the observed states from the subset are as indicated in red. So one could ask what
is the probability of observing a state in a certain position of the red subset; if such a probability
exists, it is called the probability measure of the system.[27][25]
Figure: One solution for the Rabinovich–Fabrikant equations. What is the probability of observing a state
on a certain place of the support (i.e., the red subset)?
This kind of complicated support appears quite frequently in dynamical systems. It is not simple to establish
that the system has a probability measure, and the main problem is the following. Let $t_1 \ll t_2 \ll t_3$
be instants in time and $O$ a subset of the support; if the probability measure exists for the system, one
would expect the frequency of observing states inside set $O$ to be equal in the intervals $[t_1, t_2]$ and
$[t_2, t_3]$, which might not happen; for example, it could oscillate similar to a sine, $\sin(t)$, whose
limit as $t \to \infty$ does not exist. Formally, the measure exists only if the limit of the relative
frequency converges when the system is observed into the infinite future.[28] The branch of dynamical
systems that studies the existence of a probability measure is ergodic theory.
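The idea of a relative frequency that converges over long observation times can be illustrated on a much simpler system than the one discussed above. The sketch below assumes a rotation of the unit interval by an irrational step (a hypothetical stand-in, not the Rabinovich–Fabrikant system) and tracks how often the state falls in the subset [0, 0.25).

```python
import math

ALPHA = math.sqrt(2) - 1  # irrational rotation step (assumed example system)

def state(n, x0=0.0):
    """State after n steps of the rotation x -> (x + ALPHA) mod 1."""
    return (x0 + n * ALPHA) % 1.0

def relative_frequency(n_steps, low=0.0, high=0.25):
    """Fraction of the first n_steps states that lie in [low, high)."""
    hits = sum(1 for n in range(n_steps) if low <= state(n) < high)
    return hits / n_steps

horizons = [100, 1_000, 10_000, 100_000]
frequencies = [relative_frequency(n) for n in horizons]
for n, freq in zip(horizons, frequencies):
    print(f"{n:>7} steps: relative frequency = {freq:.5f}")
```

For this system the frequencies settle near the length of the subset, 0.25, so a probability measure in the sense described above exists; a system whose frequencies kept oscillating would not have one.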
Note that even in these cases, the probability distribution, if it exists, might still be termed "continuous" or
"discrete" depending on whether the support is uncountable or countable, respectively.
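Random number generation
Suppose $U$ has a uniform distribution between 0 and 1. To construct a Bernoulli random variable for some
$0 < p < 1$, define
$$X = \begin{cases} 1, & \text{if } U < p \\ 0, & \text{if } U \ge p \end{cases}$$
so that
$$\Pr(X = 1) = \Pr(U < p) = p, \quad \Pr(X = 0) = \Pr(U \ge p) = 1 - p.$$
This random variable $X$ has a Bernoulli distribution with parameter $p$.[29] Note that this transformation
yields a discrete random variable.
For a distribution function $F$ of a continuous random variable, a continuous random variable must be
constructed. $F^{\mathit{inv}}$, an inverse function of $F$, relates to the uniform variable $U$:
$$\{U \le F(x)\} = \{F^{\mathit{inv}}(U) \le x\}.$$
For example, suppose a random variable that has an exponential distribution $F(x) = 1 - e^{-\lambda x}$
must be constructed. Solving $F(x) = u$ for $x$ gives $x = -\frac{1}{\lambda}\ln(1 - u)$, so the random
variable $X = F^{\mathit{inv}}(U) = -\frac{1}{\lambda}\ln(1 - U)$ has an exponential distribution with rate
$\lambda$.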
A frequent problem in statistical simulations (the Monte Carlo method) is the generation of pseudo-random
numbers that are distributed in a given way.
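Both constructions in this section, thresholding a uniform variate to obtain a Bernoulli variable and applying an inverse distribution function to obtain an exponential variable, can be sketched with a standard pseudo-random generator. The parameter values, the seed and the function names below are assumptions chosen for illustration.

```python
import math
import random

def bernoulli(p, rng):
    """Return 1 with probability p and 0 otherwise, from one uniform draw."""
    u = rng.random()          # uniform on [0, 1)
    return 1 if u < p else 0

def exponential(rate, rng):
    """Inverse-transform sample with CDF F(x) = 1 - exp(-rate * x)."""
    u = rng.random()          # uniform on [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate

rng = random.Random(12345)    # fixed seed so the run is reproducible
n = 100_000
p, rate = 0.3, 2.0            # hypothetical parameter values

bern_mean = sum(bernoulli(p, rng) for _ in range(n)) / n
exp_mean = sum(exponential(rate, rng) for _ in range(n)) / n

print(f"Bernoulli sample mean:   {bern_mean:.4f} (parameter p = {p})")
print(f"exponential sample mean: {exp_mean:.4f} (theoretical mean = {1 / rate})")
```

The sample means should be close to p and 1/rate respectively, which gives a quick sanity check that the transformed variates follow the intended distributions.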
Common probability distributions and their applications
The following is a list of some of the most common probability distributions, grouped by the type of
process that they are related to. For a more complete list, see list of probability distributions, which groups
by the nature of the outcome being considered (discrete, continuous, multivariate, etc.).
All of the univariate distributions below are singly peaked; that is, it is assumed that the values cluster
around a single point. In practice, actually observed quantities may cluster around multiple values. Such
quantities can be modeled using a mixture distribution.
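A mixture distribution combines several component distributions with weights that sum to one, which allows more than one peak. The sketch below assumes a hypothetical two-component normal mixture; the weights, means and standard deviations are illustrative, not taken from the text.

```python
import random
from statistics import NormalDist

# Hypothetical mixture: weight 0.4 on Normal(-2, 0.5), 0.6 on Normal(3, 0.5).
weights = [0.4, 0.6]
components = [NormalDist(mu=-2.0, sigma=0.5), NormalDist(mu=3.0, sigma=0.5)]

def mixture_pdf(x):
    """Density of the mixture: weighted sum of the component densities."""
    return sum(w * c.pdf(x) for w, c in zip(weights, components))

def mixture_cdf(x):
    """CDF of the mixture: weighted sum of the component CDFs."""
    return sum(w * c.cdf(x) for w, c in zip(weights, components))

def sample(rng):
    """Pick a component according to the weights, then draw from it."""
    component = rng.choices(components, weights=weights)[0]
    return rng.gauss(component.mean, component.stdev)

rng = random.Random(2024)  # fixed seed so the run is reproducible
draws = [sample(rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)

print("density at -2, 0.5, 3:",
      mixture_pdf(-2.0), mixture_pdf(0.5), mixture_pdf(3.0))
print("sample mean:", sample_mean)
```

The density is high near both component means and low between them, so the values cluster around two points; the overall mean (here 1.0) falls in a region where few observations occur.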
See also
Conditional probability distribution
Joint probability distribution
Quasiprobability distribution
Empirical probability distribution
Histogram
Riemann–Stieltjes integral application to probability theory
Lists
List of probability distributions
List of statistical topics
References
Citations
1. Everitt, Brian (2006). The Cambridge dictionary of statistics (3rd ed.). Cambridge, UK:
Cambridge University Press. ISBN 978-0-511-24688-3. OCLC 161828328
(https://www.worldcat.org/oclc/161828328).
2. Ash, Robert B. (2008). Basic probability theory (Dover ed.). Mineola, N.Y.: Dover
Publications. pp. 66–69. ISBN 978-0-486-46628-6. OCLC 190785258
(https://www.worldcat.org/oclc/190785258).
3. Evans, Michael; Rosenthal, Jeffrey S. (2010). Probability and statistics: the science of
uncertainty (2nd ed.). New York: W.H. Freeman and Co. p. 38. ISBN 978-1-4292-2462-8.
OCLC 473463742 (https://www.worldcat.org/oclc/473463742).
4. Ross, Sheldon M. (2010). A first course in probability. Pearson.
5. "1.3.6.1. What is a Probability Distribution"
(https://www.itl.nist.gov/div898/handbook/eda/section3/eda361.htm). www.itl.nist.gov.
Retrieved 2020-09-10.
6. Dekking, Michel (2005). A modern introduction to probability and statistics: understanding
why and how. London: Springer. ISBN 978-1-85233-896-1. OCLC 262680588
(https://www.worldcat.org/oclc/262680588).
7. Chapters 1 and 2 of Vapnik (1998)
8. Walpole, R.E.; Myers, R.H.; Myers, S.L.; Ye, K. (1999). Probability and statistics for
engineers. Prentice Hall.
9. DeGroot, Morris H.; Schervish, Mark J. (2002). Probability and Statistics. Addison-Wesley.
10. Billingsley, P. (1986). Probability and measure. Wiley. ISBN 9780471804789.
11. Shephard, N.G. (1991). "From characteristic function to distribution function: a simple
framework for the theory"
(https://ora.ox.ac.uk/objects/uuid:a4c3ad11-74fe-458c-8d58-6f74511a476c).
Econometric Theory. 7 (4): 519–529. doi:10.1017/S0266466600004746
(https://doi.org/10.1017%2FS0266466600004746).
12. More information and examples can be found in the articles Heavy-tailed distribution,
Long-tailed distribution, Fat-tailed distribution.
13. Çınlar, Erhan (2011). Probability and stochastics. New York: Springer. p. 57.
ISBN 9780387878584.
14. See Lebesgue's decomposition theorem.
15. Çınlar, Erhan (2011). Probability and stochastics. New York: Springer. p. 51.
ISBN 9780387878591. OCLC 710149819 (https://www.worldcat.org/oclc/710149819).
16. Cohn, Donald L. (1993). Measure theory. Birkhäuser.
17. Khuri, André I. (March 2004). "Applications of Dirac's delta function in statistics".
International Journal of Mathematical Education in Science and Technology. 35 (2): 185–195.
doi:10.1080/00207390310001638313 (https://doi.org/10.1080%2F00207390310001638313).
ISSN 0020-739X (https://www.worldcat.org/issn/0020-739X). S2CID 122501973
(https://api.semanticscholar.org/CorpusID:122501973).
18. Fisz, Marek (1963). Probability Theory and Mathematical Statistics (3rd ed.). John Wiley &
Sons. p. 129. ISBN 0-471-26250-1.
19. Ross, Sheldon M. (2010). Introduction to probability models. Elsevier.
20. Chapter 3.2 of DeGroot & Schervish (2002)
21. Bourne, Murray. "11. Probability Distributions - Concepts"
(https://www.intmath.com/counting-probability/11-probability-distributions-concepts.php).
www.intmath.com. Retrieved 2020-09-10.
22. Stroock, Daniel W. (1999). Probability theory: an analytic view (Rev. ed.). Cambridge
[England]: Cambridge University Press. p. 11. ISBN 978-0521663496. OCLC 43953136
(https://www.worldcat.org/oclc/43953136).
23. Kolmogorov, Andrey (1950) [1933]. Foundations of the theory of probability. New York, USA:
Chelsea Publishing Company. pp. 21–24.
24. Joyce, David (2014). "Axioms of Probability"
(https://mathcs.clarku.edu/~djoyce/ma217/axioms.pdf) (PDF). Clark University. Retrieved
December 5, 2019.
25. Alligood, K.T.; Sauer, T.D.; Yorke, J.A. (1996). Chaos: an introduction to dynamical systems.
Springer.
26. Rabinovich, M.I.; Fabrikant, A.L. (1979). "Stochastic self-modulation of waves in
nonequilibrium media". J. Exp. Theor. Phys. 77: 617–629. Bibcode:1979JETP...50..311R
(https://ui.adsabs.harvard.edu/abs/1979JETP...50..311R).
27. Section 1.9 of Ross, S.M.; Peköz, E.A. (2007). A second course in probability
(http://people.bu.edu/pekoz/A_Second_Course_in_Probability-Ross-Pekoz.pdf) (PDF).
28. Walters, Peter (2000). An Introduction to Ergodic Theory. Springer.
29. Dekking, Frederik Michel; Kraaikamp, Cornelis; Lopuhaä, Hendrik Paul; Meester, Ludolf
Erwin (2005), "Why probability and statistics?", A Modern Introduction to Probability and
Statistics, Springer London, pp. 1–11, doi:10.1007/1-84628-168-7_1
(https://doi.org/10.1007%2F1-84628-168-7_1), ISBN 978-1-85233-896-1
30. Bishop, Christopher M. (2006). Pattern recognition and machine learning. New York:
Springer. ISBN 0-387-31073-8. OCLC 71008143 (https://www.worldcat.org/oclc/71008143).
31. Chang, Raymond. (2014). Physical chemistry for the chemical sciences. Thoman, John W.,
Jr., 1960-. [Mill Valley, California]. pp. 403–406. ISBN 978-1-68015-835-9.
OCLC 927509011 (https://www.worldcat.org/oclc/927509011).
32. Chen, P.; Chen, Z.; Bak-Jensen, B. (April 2008). "Probabilistic load flow: A review". 2008
Third International Conference on Electric Utility Deregulation and Restructuring and Power
Technologies. pp. 1586–1591. doi:10.1109/drpt.2008.4523658
(https://doi.org/10.1109%2Fdrpt.2008.4523658). ISBN 978-7-900714-13-8. S2CID 18669309
(https://api.semanticscholar.org/CorpusID:18669309).
33. Maity, Rajib (2018-04-30). Statistical methods in hydrology and hydroclimatology.
Singapore. ISBN 978-981-10-8779-0. OCLC 1038418263
(https://www.worldcat.org/oclc/1038418263).
Sources
den Dekker, A. J.; Sijbers, J. (2014). "Data distributions in magnetic resonance images: A
review". Physica Medica. 30 (7): 725–741. doi:10.1016/j.ejmp.2014.05.002
(https://doi.org/10.1016%2Fj.ejmp.2014.05.002). PMID 25059432
(https://pubmed.ncbi.nlm.nih.gov/25059432).
Vapnik, Vladimir Naumovich (1998). Statistical Learning Theory. John Wiley and Sons.
External links
"Probability distribution"
(https://www.encyclopediaofmath.org/index.php?title=Probability_distribution),
Encyclopedia of Mathematics, EMS Press, 2001 [1994]
Field Guide to Continuous Probability Distributions (http://threeplusone.com/FieldGuide.pdf),
Gavin E. Crooks.