Binomial distribution


The binomial distribution is a discrete probability distribution that describes how many times a given outcome occurs over $n$ independent attempts, where each attempt can only take two values: true or false, heads or tails, etc. Besides the number of attempts $n$, it is described by one parameter: $p$, which is the probability of one of the two values occurring in a single attempt. We also define $q=1-p$, which is the probability of the other value occurring.

For a random variable $K$, the binomial probability mass function is

$$
P_{k}=P(k;n,p)=\begin{pmatrix}n \\ k\end{pmatrix}p^{k}q^{n-k}
$$

using the binomial coefficient, hence the name. It is the probability that exactly $k$ out of $n$ total attempts yield the desired value (true, heads...). $k$ must be an integer satisfying $0\leq k\leq n$.
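
As a minimal sketch, the mass function above can be evaluated directly with the standard library. The function name `binomial_pmf` and the coin-toss numbers are illustrative choices, not part of the original note.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k desired outcomes in n independent attempts."""
    if not 0 <= k <= n:
        return 0.0  # outside the allowed range 0 <= k <= n
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)


# Illustrative case: exactly 3 heads in 10 tosses of a fair coin.
p_three_heads = binomial_pmf(3, 10, 0.5)

# The probabilities over all allowed k add up to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
```

Here `comb(10, 3)` is 120 and $p^{3}q^{7}=1/1024$, so `p_three_heads` equals $120/1024$.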

Moments

The raw moment-generating function is

$$
M_{K}^{*}(t)=E[e^{tK}]=\sum_{k=0}^{n} e^{tk}\begin{pmatrix}n \\ k\end{pmatrix}p^{k}q^{n-k}=\sum_{k=0}^{n} \begin{pmatrix}n \\ k\end{pmatrix}(e^{t}p)^{k}q^{n-k}=(e^{t}p+q)^{n}
$$

The central moment-generating function is

$$
M_{K}(t)=E[e^{t(K-np)}]=e^{-tnp}M_{K}^{*}(t)=(e^{-tp}e^{t}p+e^{-tp}q)^{n}=(e^{tq}p+e^{-tp}q)^{n}
$$

Some moments are:

  • Raw
    0. $\mu_{0}^{*}=1$
    1. $\mu_{1}^{*}=np$ (expected value)
  • Central
    0. $\mu_{0}=1$
    1. $\mu_{1}=0$
    2. $\mu_{2}=npq$ (variance)
    3. $\mu_{3}=npq(q-p)$
    4. $\mu_{4}=(1-6pq+3npq)npq$
  • Coefficients
    1. $\gamma_{1}=\frac{q-p}{\sqrt{ npq }}$ (skewness, goes to zero as $n\to \infty$ or if $p=1/2$)
    2. $\gamma_{2}=\frac{1-6pq}{npq}$ (excess kurtosis, goes to zero as $n\to \infty$)
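
The closed forms listed above can be checked numerically by summing over the mass function. The following is a sketch under illustrative assumptions: the helper names `pmf` and `central_moment` and the values $n=12$, $p=0.3$ are arbitrary choices made for this check.

```python
from math import comb, sqrt


def pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def central_moment(r, n, p):
    """r-th central moment, computed by direct summation over k."""
    mean = n * p
    return sum((k - mean)**r * pmf(k, n, p) for k in range(n + 1))


n, p = 12, 0.3  # arbitrary illustrative values
q = 1 - p

mu2 = central_moment(2, n, p)  # compare with n*p*q
mu3 = central_moment(3, n, p)  # compare with n*p*q*(q - p)
mu4 = central_moment(4, n, p)  # compare with (1 - 6*p*q + 3*n*p*q)*n*p*q

# Standardized coefficients built from the central moments.
skewness = mu3 / mu2**1.5
excess_kurtosis = mu4 / mu2**2 - 3
```

Dividing $\mu_{4}$ by $\mu_{2}^{2}$ and subtracting 3 gives $\frac{1-6pq}{npq}$, which is why $\gamma_{2}$ is an excess kurtosis, measured relative to the Gaussian value of 3.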

Relation to other distributions

  • For $n=1$, we get a Bernoulli distribution, $\text{Bernoulli}(x;p)$.
  • When the sample size becomes large ($n\to \infty$) while the expected number of successes stays finite ($np\to \nu$, which requires $p\to 0$), it becomes a Poisson distribution, $\text{Pois}(x;\nu)$.
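
The Poisson limit can be illustrated numerically by holding $np$ fixed while $n$ grows. This is only a sketch: the value $\nu=2$, the range of $k$ compared, and the helper names are assumptions chosen for the illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)


def poisson_pmf(k, nu):
    return exp(-nu) * nu**k / factorial(k)


nu = 2.0  # fixed expected number of successes (illustrative)


def max_gap(n):
    """Largest pointwise difference between the two mass functions for k = 0..10."""
    p = nu / n  # shrink p so that n*p stays equal to nu
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, nu)) for k in range(11))


# The gap shrinks as n grows with n*p held fixed.
gaps = [max_gap(n) for n in (10, 100, 1000, 10000)]
```

Each tenfold increase in `n` reduces the largest gap, consistent with the binomial mass function converging to the Poisson one.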

Histograms

The binomial distribution has a special connection to histograms. Given a data sample of $n$ independent events, each event either falls in a given bin or it does not, so the number of events in each bin is a random variable that follows the binomial distribution. Thus, the expected number of events in each bin is $np$, where $p$ is the probability of falling in that bin, and the standard deviation is $\sqrt{ npq }$. It is possible to find $p$ through the cumulative distribution function $F(x)$ as $p=F(x_{i+1})-F(x_{i})$, where $x_{i}$ and $x_{i+1}$ are the left and right edges of the $i$-th bin.
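
As a sketch of how this works in practice, the code below computes the expected count and standard deviation for each bin from a cumulative distribution function. The standard normal distribution, the bin edges, and the sample size of 1000 are assumptions picked for the illustration, and `bin_expectations` is a hypothetical helper name.

```python
from math import erf, sqrt


def normal_cdf(x):
    """CDF of the standard normal distribution (assumed target distribution)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bin_expectations(edges, n, cdf):
    """Return (expected count, standard deviation) for each histogram bin."""
    results = []
    for left, right in zip(edges[:-1], edges[1:]):
        p = cdf(right) - cdf(left)  # probability of landing in this bin
        q = 1.0 - p
        results.append((n * p, sqrt(n * p * q)))
    return results


edges = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]  # illustrative bin edges
n = 1000                                         # illustrative sample size
expected = bin_expectations(edges, n, normal_cdf)
```

Comparing observed bin counts against these expectations, in units of the corresponding standard deviation, gives a rough measure of how far a histogram strays from the assumed distribution.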

This property is useful for analyzing the dispersion of a histogram and how far it is from the intended distribution. This method is used in the Pearson goodness-of-fit chi-square test and its derivatives, like the $\chi ^{2}$ independence test.

If the probability of landing in specific bins is small and the sample size is large, we can claim we are in the $np\to \nu$ limit and use the Poisson distribution instead.

Examples

> Unsurprisingly, as the number of attempts $n$ goes up, the probability that the desired value occurs at least once also goes up.
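
One way to make this remark concrete, assuming it refers to the desired value occurring at least once, is to use $P(K\geq 1)=1-P(0;n,p)=1-q^{n}$. The sketch below uses a single-attempt probability of $1/6$ (rolling a chosen face of a fair die) purely as an illustration.

```python
def prob_at_least_once(n: int, p: float) -> float:
    """Probability that the desired value occurs at least once in n attempts."""
    q = 1.0 - p
    return 1.0 - q**n  # complement of "never occurs", which has probability q**n


# Illustrative case: rolling a chosen face of a fair die, p = 1/6.
probs = [prob_at_least_once(n, 1 / 6) for n in (1, 5, 10, 20)]
```

Since $q<1$, the term $q^{n}$ shrinks with every additional attempt, so the values in `probs` increase with $n$ and approach 1.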