The binomial distribution is a discrete probability distribution that describes events that can only take two values: true or false, heads or tails, etc. Besides the number of attempts $n$, it is described by one parameter, $p$, which is the probability of one of the two values occurring. We also define $q = 1 - p$, which is the probability of the other value occurring.
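For a random variable $X$, the binomial probability mass function is

$$P(X = k) = \binom{n}{k} p^k q^{n-k}$$

using the binomial coefficient $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, hence the name. It is the probability that exactly $k$ events take the desired value (true, heads, ...) over $n$ total attempts. $k$ must be an integer between $0$ and $n$.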
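As a minimal sketch, the mass function $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ can be evaluated with the Python standard library alone. The function name `binomial_pmf` and the coin-toss numbers are illustrative choices, not part of the original text.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k desired outcomes in n attempts."""
    if not 0 <= k <= n:
        # Outside the support the probability is zero.
        return 0.0
    q = 1.0 - p
    return comb(n, k) * p**k * q ** (n - k)


# Probability of exactly 3 heads in 10 tosses of a fair coin.
prob_three_heads = binomial_pmf(3, 10, 0.5)

# Summing the mass function over its whole support should give 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(prob_three_heads, total)
```

Using `math.comb` keeps the binomial coefficient exact as an integer before it is multiplied by the floating-point powers.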
Moments
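The raw moment-generating function is

$$M_X(t) = E\left[e^{tX}\right] = \left(q + p e^{t}\right)^n$$

The central moment-generating function is

$$E\left[e^{t(X - np)}\right] = e^{-npt}\left(q + p e^{t}\right)^n = \left(q e^{-pt} + p e^{qt}\right)^n$$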
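Some moments are:
- Raw
  1. $\mu'_1 = E[X] = np$ (mean)
- Central
  1. $\mu_1 = 0$
  2. $\mu_2 = npq$ (variance)
- Coefficients
  1. $\gamma_1 = \frac{q - p}{\sqrt{npq}}$ (skewness, goes to zero as $n \to \infty$ or if $p = q$)
  2. $\gamma_2 = \frac{1 - 6pq}{npq}$ (excess kurtosis, goes to zero for $n \to \infty$)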
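The standard closed forms for the binomial moments (mean $np$, variance $npq$, skewness $(q-p)/\sqrt{npq}$, excess kurtosis $(1-6pq)/(npq)$, with $q = 1 - p$) can be checked numerically by summing directly over the mass function. The sketch below does this; the helper names and the values $n = 20$, $p = 0.3$ are arbitrary illustrative choices.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """P(X = k) for n attempts with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def central_moment(order, n, p):
    """E[(X - mean)^order], computed by direct summation over k = 0..n."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    return sum((k - mean) ** order * binomial_pmf(k, n, p) for k in range(n + 1))


n, p = 20, 0.3
q = 1 - p

mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
variance = central_moment(2, n, p)
skewness = central_moment(3, n, p) / variance**1.5
excess_kurtosis = central_moment(4, n, p) / variance**2 - 3

# Compare each summed value with its closed form.
print(mean, n * p)
print(variance, n * p * q)
print(skewness, (q - p) / sqrt(n * p * q))
print(excess_kurtosis, (1 - 6 * p * q) / (n * p * q))
```

Each printed pair should agree up to floating-point rounding.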
Relation to other distributions
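- For $n = 1$, we get a Bernoulli distribution, $P(X = k) = p^k q^{1-k}$ with $k \in \{0, 1\}$.
- When the sample size becomes large ($n \to \infty$) but the expected number of successes doesn't increase ($np = \lambda$ stays constant), it becomes a Poisson distribution, $P(X = k) \to \frac{\lambda^k e^{-\lambda}}{k!}$.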
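The Poisson limit can be illustrated numerically: holding $\lambda = np$ fixed and increasing $n$, the binomial mass function should approach $\lambda^k e^{-\lambda} / k!$. The following is a rough sketch; the helper `max_abs_difference`, the choice $\lambda = 2$, and the cut-off `k_max = 10` are assumptions made only for this example.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for n attempts with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability of k events with expected count lam."""
    return lam**k * exp(-lam) / factorial(k)


def max_abs_difference(n, lam, k_max=10):
    """Largest pointwise gap between the two mass functions for k = 0..k_max."""
    p = lam / n  # keep the expected number of successes n*p fixed at lam
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1)
    )


lam = 2.0
diff_small_n = max_abs_difference(20, lam)
diff_large_n = max_abs_difference(2000, lam)

print(diff_small_n, diff_large_n)
```

The gap for $n = 2000$ should come out much smaller than for $n = 20$, which is the convergence described in the list above.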
Histograms
The binomial distribution has a special connection to histograms. Given a data sample of size $n$, the number of events $k_i$ in each bin is a random variable that approximately follows the binomial distribution. Thus, the expected number of events in each bin is $E[k_i] = n p_i$, where $p_i$ is the probability of falling in that bin, and the standard deviation is $\sigma_i = \sqrt{n p_i (1 - p_i)}$. It is possible to find $p_i$ through the cumulative distribution function $F$ as $p_i = F(b_i) - F(a_i)$, where $a_i$ and $b_i$ are the left and right edges of the $i$-th bin.
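As a sketch of how this might look in practice, the code below takes a standard normal as the intended distribution (an assumption made for the example), computes each bin probability from the cumulative distribution function, and compares the observed counts of a simulated sample with the binomial expectation. The helper names `normal_cdf`, `bin_probability` and `expected_bin_counts` are hypothetical.

```python
import random
from math import erf, sqrt


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bin_probability(left, right, cdf=normal_cdf):
    """Probability of falling in [left, right): CDF(right) - CDF(left)."""
    return cdf(right) - cdf(left)


def expected_bin_counts(edges, n, cdf=normal_cdf):
    """Return (expected count, standard deviation) for each bin."""
    result = []
    for left, right in zip(edges[:-1], edges[1:]):
        p_i = bin_probability(left, right, cdf)
        result.append((n * p_i, sqrt(n * p_i * (1.0 - p_i))))
    return result


random.seed(0)  # fixed seed so the simulated sample is reproducible
n = 10_000
sample = [random.gauss(0.0, 1.0) for _ in range(n)]
edges = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

# Count the events in each bin; values outside the edges are ignored.
observed = [
    sum(1 for x in sample if left <= x < right)
    for left, right in zip(edges[:-1], edges[1:])
]
predictions = expected_bin_counts(edges, n)

# Pull: distance of each observed count from its expectation, in units of sigma.
pulls = [(obs - mean) / std for obs, (mean, std) in zip(observed, predictions)]
for obs, (mean, std), pull in zip(observed, predictions, pulls):
    print(f"observed={obs:5d}  expected={mean:8.1f} +/- {std:5.1f}  pull={pull:+.2f}")
```

If the sample really follows the intended distribution, most pulls are expected to be of order one.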
This property is useful to analyze the dispersion of a histogram and how far it is from the intended distribution. This method is used in the Pearson goodness-of-fit chi-square test and its derivatives, like the independence test.
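For illustration, Pearson's statistic in its usual form, $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$, compares observed bin counts $O_i$ with expected counts $E_i = n p_i$. The sketch below uses made-up counts for 60 rolls of a die assumed to be fair; both the data and the function name `pearson_chi_square` are hypothetical.

```python
def pearson_chi_square(observed, expected):
    """Sum over bins of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for the six faces after 60 rolls (illustrative only).
observed = [8, 12, 9, 11, 6, 14]
n = sum(observed)

# A fair die gives p_i = 1/6 for every face, so E_i = n * p_i = 10.
expected = [n / 6] * 6

chi_square = pearson_chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1

print(chi_square, degrees_of_freedom)
```

Turning the statistic into a p-value requires the chi-square distribution with the stated degrees of freedom, which is left out here to keep the example free of external dependencies.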
If the probability of landing in specific bins is small and the sample size is large, we can treat the bin counts as being in the Poisson limit ($n \to \infty$ with $n p_i$ fixed) and use the Poisson distribution instead.
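To get a feel for when this approximation is reasonable, one can compare the binomial standard deviation $\sqrt{n p_i (1 - p_i)}$ with the Poisson one, $\sqrt{n p_i}$. The sketch below does so for a few bin probabilities; the sample size and the probabilities are arbitrary illustrative values.

```python
from math import sqrt


def binomial_std(n, p):
    """Standard deviation of a binomial bin count."""
    return sqrt(n * p * (1.0 - p))


def poisson_std(n, p):
    """Standard deviation of a Poisson bin count with mean n*p."""
    return sqrt(n * p)


n = 100_000
probabilities = (0.3, 0.01, 0.0001)

# Relative amount by which the Poisson error bar overestimates the binomial one.
relative_gap = {
    p: 1.0 - binomial_std(n, p) / poisson_std(n, p) for p in probabilities
}

for p, gap in relative_gap.items():
    print(f"p={p:<7} binomial={binomial_std(n, p):8.3f} "
          f"poisson={poisson_std(n, p):8.3f} gap={gap:.2e}")
```

The gap equals $1 - \sqrt{1 - p_i}$, so it depends only on the bin probability and shrinks as the bins become less populated.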