A cumulative distribution function (CDF) is a function associated with a random variable that gives the probability that the variable will take a value less than or equal to some value $x$.
Formally, for a random variable $X$, its cumulative distribution function is
$$F_X(x) = P(X \le x),$$
where $P$ is a probability measure. The probability that $X$ lies within the semi-closed interval $(a, b]$ is
$$P(a < X \le b) = F_X(b) - F_X(a).$$
If $X$'s probability distribution has a probability density function $f_X$, we can state
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt.$$
In this case, the CDF is said to identify the distribution.
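The relationships above can be checked numerically. The following is a minimal sketch, assuming a standard normal distribution as the example; the helper names (`normal_pdf`, `normal_cdf`, `cdf_from_pdf`, `interval_probability`) and the finite lower bound standing in for $-\infty$ are illustrative choices, not part of the definition.

```python
import math


def normal_pdf(t):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Closed-form CDF of the standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_from_pdf(pdf, x, lower=-10.0, steps=20_000):
    """Approximate F(x) as the integral of pdf over [lower, x] (trapezoidal rule).

    Assumption: `lower` is far enough into the left tail that the mass
    below it is negligible, so it can stand in for minus infinity.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))
    for i in range(1, steps):
        total += pdf(lower + i * h)
    return total * h


def interval_probability(cdf, a, b):
    """P(a < X <= b) = F(b) - F(a) for the semi-closed interval (a, b]."""
    return cdf(b) - cdf(a)


if __name__ == "__main__":
    print(normal_cdf(1.0), cdf_from_pdf(normal_pdf, 1.0))
    print(interval_probability(normal_cdf, -1.0, 1.0))
```

Integrating the density and evaluating the closed-form CDF should agree up to the numerical error of the integration, which is one way to sanity-check a hand-written PDF/CDF pair.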
Properties
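- The inverse of the CDF is defined as $F_X^{-1}(p) = \inf \{ x \in \mathbb{R} : F_X(x) \ge p \}$ where $p \in (0, 1)$. If $F_X$ is continuous and strictly increasing, this is equivalent to the usual definition of inverse. $F_X^{-1}$ is the quantile function.
- If $F_X$ is continuous, then the random variable $U = F_X(X)$ follows a uniform distribution on $[0, 1]$. Conversely, if $U$ follows a uniform distribution on $[0, 1]$, then the random variable $F_X^{-1}(U)$ has $F_X$ as its CDF.[1]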
- The CDF always exists, even when the PDF doesn't: for instance, discrete probability distributions have a CDF without a PDF. More generally, a probability distribution has a PDF if and only if its CDF is absolutely continuous.
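The inverse-CDF properties in the list above can be sketched in code. This is an illustrative example, assuming an exponential distribution (whose quantile function has a closed form) and a small hand-made discrete distribution; all function names and the `rate` parameter are hypothetical choices for the sketch.

```python
import math
import random


def exponential_cdf(x, rate=1.0):
    """CDF of the exponential distribution: 1 - exp(-rate * x) for x >= 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exponential_quantile(p, rate=1.0):
    """Inverse of exponential_cdf, valid for 0 <= p < 1."""
    return -math.log(1.0 - p) / rate


def sample_exponential(n, rate=1.0, rng=None):
    """Inversion sampling: push uniform draws through the quantile function."""
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so 1 - p stays strictly positive.
    return [exponential_quantile(rng.random(), rate) for _ in range(n)]


def discrete_quantile(values, probs, p):
    """Generalized inverse inf{x : F(x) >= p} for a discrete distribution.

    Assumption: `values` is sorted in ascending order and `probs` sums to 1.
    """
    cumulative = 0.0
    for value, prob in zip(values, probs):
        cumulative += prob
        if cumulative >= p:
            return value
    return values[-1]  # guards against floating-point round-off in the sum


if __name__ == "__main__":
    rng = random.Random(0)
    draws = sample_exponential(10_000, rate=2.0, rng=rng)
    print(sum(draws) / len(draws))  # should be close to 1 / rate
    print(discrete_quantile([1, 2, 3], [0.2, 0.5, 0.3], 0.5))
```

The discrete case shows why the infimum definition is needed: a step-function CDF has no ordinary inverse, yet the generalized inverse still returns a well-defined quantile.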
Empirical CDF
The empirical cumulative distribution function (ECDF) is the CDF that is obtained from empirical measurements: its value at $x$ is the fraction of observations less than or equal to $x$. Since a sample contains finitely many measurements, the ECDF is always a step function. As the number of observations grows, the ECDF converges to the CDF of the underlying distribution.
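A minimal sketch of an ECDF, assuming the observations fit in memory as a list of numbers; the name `make_ecdf` is a hypothetical helper, and the binary search via `bisect` is just one convenient way to count observations at or below a point.

```python
import bisect
import random


def make_ecdf(samples):
    """Return a function x -> fraction of samples that are <= x."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("ECDF needs at least one observation")

    def ecdf(x):
        # bisect_right counts the observations that are <= x.
        return bisect.bisect_right(ordered, x) / n

    return ecdf


if __name__ == "__main__":
    rng = random.Random(0)
    # Uniform(0, 1) data, whose true CDF is F(x) = x on [0, 1].
    for n in (10, 100, 10_000):
        ecdf = make_ecdf([rng.random() for _ in range(n)])
        grid = [i / 100 for i in range(101)]
        worst_gap = max(abs(ecdf(x) - x) for x in grid)
        print(n, worst_gap)
```

Running the demo prints the largest gap between the ECDF and the true uniform CDF on a grid, which tends to shrink as the sample size grows.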
Footnotes
[1] Basically, you can freely convert between $X$ and $U$ as long as you know $F_X$ and its inverse $F_X^{-1}$. This might seem like a forgettable property, but it's central to the inversion sampling method for generating data following a PDF $f_X$.