Statistic


A statistic is a function of a statistical sample, in the sense of a set of random variables, used to quantify properties of the sample, generally for descriptive or testing purposes. It is itself a random variable. Examples include the sample mean and sample variance. See also Functions of random variables for general properties.
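The point that a statistic is itself a random variable can be illustrated with a minimal Python sketch. It draws several independent samples and evaluates the same two statistics on each one. The helper names, the sample size, the seed, and the Gaussian model are all assumptions made for the illustration.

```python
import random
import statistics


def draw_sample(rng, n, mu=0.0, sigma=1.0):
    """Draw n iid Gaussian observations (mu and sigma are illustrative choices)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


def summarize(sample):
    """Evaluate two statistics on a sample: (sample mean, sample variance)."""
    # statistics.variance uses the N - 1 denominator.
    return statistics.mean(sample), statistics.variance(sample)


rng = random.Random(0)  # fixed seed so the run is reproducible
results = [summarize(draw_sample(rng, n=20)) for _ in range(5)]

# Each repetition yields a different value of the same statistic.
for mean, var in results:
    print(f"mean={mean:+.3f}  variance={var:.3f}")
```

Every line of output comes from the same function applied to a different realization of the sample, which is why the values differ from run to run of the experiment.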

Sufficient statistic

Given a random vector $\mathbf{X}$ with joint density function (JDF) $f_{\mathbf{X}}(\mathbf{x};\theta)$, where $\theta$ is a parameter, a statistic $t(\mathbf{X})$ is said to be sufficient for $\theta$ if $f_{\mathbf{X}}(\mathbf{x};\theta)$ can be written as

$$f_{\mathbf{X}}(\mathbf{x};\theta)=h(\mathbf{x})\,g(t(\mathbf{x});\theta)$$

where $h$ is a function that does not depend on $\theta$ and $g$ is a function that depends on $\mathbf{x}$ only through $t(\mathbf{x})$. All the information about $\theta$ contained in $\mathbf{X}$ is then supplied by $t(\mathbf{X})$.
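
As a brief illustration (a standard textbook case, not taken from these notes), consider $N$ iid Bernoulli variables with success probability $\theta$, where the joint probability mass function plays the role of the density:

$$f_{\mathbf{X}}(\mathbf{x};\theta)=\prod_{i=1}^{N}\theta^{x_{i}}(1-\theta)^{1-x_{i}}=\theta^{t(\mathbf{x})}(1-\theta)^{N-t(\mathbf{x})},\qquad t(\mathbf{x})=\sum_{i=1}^{N}x_{i}$$

Here one can take $h(\mathbf{x})=1$ and $g(t;\theta)=\theta^{t}(1-\theta)^{N-t}$, so the number of successes is sufficient for $\theta$.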

Given a vector of $N$ iid Gaussian random variables, $X_{i}\sim \mathcal{N}(\mu,\sigma ^{2})$, the parameter vector is $\boldsymbol{\theta}=(\mu,\sigma ^{2})$. The joint PDF of the random vector is a multivariate normal that factorizes into a product of univariate densities:

$$
\begin{align}
f(\mathbf{x};\boldsymbol{\theta})&=\prod_{i=1}^{N} \frac{1}{\sqrt{ 2\pi }\sigma}\exp\left( - \frac{1}{2\sigma ^{2}}(x_{i}-\mu)^{2} \right) \\
&=\frac{1}{(\sqrt{ 2\pi })^{N}\sigma^{N}}\exp\left( - \frac{1}{2\sigma ^{2}}\sum_{i=1}^{N} (x_{i}-\mu)^{2} \right)
\end{align}
$$

It can be proven that the statistic $t(\mathbf{X})=(\bar{x},s^{2})$ is sufficient for $\boldsymbol{\theta}$, where $\bar{x}$ and $s^{2}$ are the sample mean and sample variance of the random vector (interpreted as a sample).
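
One way to see why, assuming the sample variance uses the $N-1$ denominator (the text does not fix the normalization), is to expand the sum in the exponent around the sample mean $m$:

$$\sum_{i=1}^{N}(x_{i}-\mu)^{2}=(N-1)s^{2}+N(m-\mu)^{2},\qquad m=\frac{1}{N}\sum_{i=1}^{N}x_{i},\quad s^{2}=\frac{1}{N-1}\sum_{i=1}^{N}(x_{i}-m)^{2}$$

The joint density then depends on the data only through the pair $(m,s^{2})$, together with the known sample size $N$, so the factorization holds with $h=1$. The Python sketch below checks this numerically. The function names, the data values, and the parameter grid are illustrative assumptions.

```python
import math
import statistics


def log_density_full(xs, mu, sigma2):
    """Gaussian joint log-density evaluated from every observation."""
    n = len(xs)
    sum_sq = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sum_sq / (2 * sigma2)


def log_density_from_stat(n, mean, s2, mu, sigma2):
    """Same log-density, using only the sample mean, sample variance and n."""
    # Identity: sum (x_i - mu)^2 = (n - 1) * s2 + n * (mean - mu)^2
    sum_sq = (n - 1) * s2 + n * (mean - mu) ** 2
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sum_sq / (2 * sigma2)


xs = [1.2, -0.4, 0.7, 2.1, 0.3]  # arbitrary illustrative data
mean = statistics.mean(xs)
s2 = statistics.variance(xs)  # N - 1 denominator

pairs = []
for mu, sigma2 in [(0.0, 1.0), (0.5, 2.0), (-1.0, 0.25)]:
    full = log_density_full(xs, mu, sigma2)
    reduced = log_density_from_stat(len(xs), mean, s2, mu, sigma2)
    pairs.append((full, reduced))
    print(f"mu={mu:+.2f} sigma2={sigma2:.2f}  agree={math.isclose(full, reduced, rel_tol=1e-9)}")
```

The two functions agree for every parameter value tried, even though the second never sees the individual observations.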