Expected value - Aetherwisp

The expected value or expectation $\text{E}[X]$ of a random variable $X$ is a generalization of a weighted average over all possible values the variable can take. It is what the word "mean" typically refers to in the context of statistics, though the word has other meanings as well, such as the arithmetic mean of a sample. It is the first raw moment of the probability distribution. The name "expectation" refers to both the value $\text{E}[X]$ itself and the operator $\text{E}$ that is applied to $X$. It is also commonly denoted by the letter $\mu$.

The name can be misleading: the expected value is not the most likely value (that would be the mode). It is strictly theoretical and may not even be an allowed value of the random variable: for instance, the expected value of a fair six-sided die is 3.5, which is not even on the die.

The definition differs between discrete and continuous variables, and, for discrete variables, between finite and countably infinite sets of outcomes.

Discrete variable with finite outcomes#

Given a discrete random variable $X$ with a finite set of possible outcomes $\{ x_{1},\ldots,x_{N} \}$ and a probability mass function $p_{X}(x)$, the expected value is

\text{E}[X]=x_{1}p_{X}(x_{1})+\ldots+x_{N}p_{X}(x_{N})=\sum_{i=1}^{N} x_{i}p_{X}(x_{i})

which is just the average weighted by the probability of an outcome.
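As a small illustration of the weighted sum above, the die example from the introduction can be computed exactly. This is only a sketch: the helper name `expected_value` and the dictionary representation of the probability mass function are choices made here for illustration, not part of the definition.

```python
from fractions import Fraction

def expected_value(pmf):
    """Return the sum of x * p(x) for a finite pmf given as {outcome: probability}."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())

# Fair six-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die)
print(die_mean)  # 7/2
```

Exact fractions are used so that the result is 7/2 rather than a rounded float; it equals 3.5, which is not one of the faces.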

Discrete variable with countably infinite outcomes#

Under the same conditions as above, but with a countably infinite set of outcomes $\{ x_{i} \}_{i}$, the expected value can be defined by extending the sum to an infinite series:

\text{E}[X]=\sum_{i=1}^{\infty} x_{i}p_{X}(x_{i})

The Riemann series theorem states that a conditionally convergent series (one that converges, but not absolutely) can be rearranged so that it converges to any chosen value, or diverges. The outcomes of a random variable have no preferred ordering, so a meaningful expected value must not depend on the order in which the terms are summed. Thus, this definition only holds if the series converges absolutely, that is, if $\sum_{i} \lvert x_{i} \rvert p_{X}(x_{i})$ is finite, in which case the order is not important. If the series is not absolutely convergent, it is said that the variable does not have finite expectation.
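The dependence on ordering can be seen numerically. The sketch below assumes a hypothetical variable with outcomes $x_{i}=(-1)^{i+1}2^{i}/i$ and probabilities $p_{X}(x_{i})=2^{-i}$ (which sum to 1), so that each term $x_{i}p_{X}(x_{i})$ equals $(-1)^{i+1}/i$. The resulting series is the alternating harmonic series, which converges but not absolutely. The function names are illustrative.

```python
import math

def term(i):
    # x_i * p(x_i) for the hypothetical variable described above:
    # (-1)**(i+1) * 2**i / i  times  2**(-i)  simplifies to (-1)**(i+1) / i.
    return (-1) ** (i + 1) / i

def natural_order(n_terms):
    """Partial sum in the order i = 1, 2, 3, ..."""
    return sum(term(i) for i in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Partial sum of the same terms, taken as one positive term
    followed by two negative terms in each block."""
    total = 0.0
    for k in range(1, n_blocks + 1):
        total += term(2 * k - 1) + term(4 * k - 2) + term(4 * k)
    return total

in_order = natural_order(100_000)   # approaches ln(2)
reordered = rearranged(100_000)     # approaches ln(2) / 2
print(in_order, reordered, math.log(2))
```

Both sums use exactly the same terms, yet they approach different limits, so no single value can be called the expectation of this variable.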

Continuous variable#

Given a continuous random variable $X$ with a probability density function $f_{X}(x)$, the expected value is

\text{E}[X]=\int_{-\infty}^{\infty} xf_{X}(x) \ dx

As with the series above, the integral may fail to converge absolutely, in which case the variable does not have finite expectation.
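For a density whose integral does converge, the expected value can be approximated numerically. The sketch below assumes an exponential density with rate 2 (exact mean 1/2) as the example, truncates the integral at an arbitrary upper limit where the tail is negligible, and uses a plain midpoint rule. The function names and the step count are illustrative choices.

```python
import math

def expected_value_continuous(pdf, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of x * pdf(x) over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * width  # midpoint of the k-th subinterval
        total += x * pdf(x)
    return total * width

RATE = 2.0  # assumed rate parameter for the example density

def exponential_pdf(x):
    """Density of an exponential distribution with rate RATE (zero for x < 0)."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

# The density is zero below 0, and the tail beyond 40 is negligible.
approx_mean = expected_value_continuous(exponential_pdf, 0.0, 40.0)
print(approx_mean)  # close to 1 / RATE
```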

Multiple variables#

As a single number, the expected value is defined for one variable at a time, not for several variables jointly. However, it is possible to find the expected value of one variable among many from their joint distribution.

For $N$ continuous random variables $X_{1},\ldots,X_{N}$ with joint density function (JDF) $f(x_{1},\ldots,x_{N})$, where each $X_{j}$ takes values in $\Omega_{j}$, the expected value of $X_{i}$ is

\text{E}[X_{i}]=\int_{\Omega_{N}}\ldots \int_{\Omega_{1}}x_{i}f(x_{1},\ldots,x_{N})dx_{1}\ldots dx_{N}
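A two-variable case makes the formula concrete. The sketch below assumes the joint density $f(x_{1},x_{2})=x_{1}+x_{2}$ on the unit square, which integrates to 1 and gives $\text{E}[X_{1}]=7/12$ exactly. The integral is approximated with a midpoint rule on a grid; the function names and grid size are illustrative.

```python
def marginal_mean(joint_pdf, index, bounds, steps=200):
    """Approximate E[X_index] for a two-variable joint density on a rectangle.

    bounds is ((a1, b1), (a2, b2)); index is 0 for X_1 and 1 for X_2.
    """
    (a1, b1), (a2, b2) = bounds
    h1 = (b1 - a1) / steps
    h2 = (b2 - a2) / steps
    total = 0.0
    for i in range(steps):
        x1 = a1 + (i + 0.5) * h1
        for j in range(steps):
            x2 = a2 + (j + 0.5) * h2
            point = (x1, x2)
            # Integrand: the chosen coordinate times the joint density.
            total += point[index] * joint_pdf(x1, x2)
    return total * h1 * h2

def example_joint_pdf(x1, x2):
    """Assumed example density on the unit square."""
    return x1 + x2

unit_square = ((0.0, 1.0), (0.0, 1.0))
mean_x1 = marginal_mean(example_joint_pdf, 0, unit_square)
mean_x2 = marginal_mean(example_joint_pdf, 1, unit_square)
print(mean_x1, mean_x2)  # both close to 7/12
```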

Properties#

The expectation has some useful properties:

If $X>0$ then $\text{E}[X]>0$.
It is linear: $\text{E}[aX+bY]=a\text{E}[X]+b\text{E}[Y]$, where $a$ and $b$ are constants. This follows from the linearity of a series or integral.
It is monotonic: if $X\leq Y$, then $\text{E}[X]\leq \text{E}[Y]$.
If $X=Y$, then $\text{E}[X]=\text{E}[Y]$.
If $\text{E}[\lvert X \rvert]=0$ then $X=0$ almost surely, that is, with probability 1.
If $X=c$ for a constant $c$, then $\text{E}[X]=c$. As a consequence, since the expectation is a constant, the expectation operator is idempotent: $\text{E}[\text{E}[X]]=\text{E}[X]$.
$\text{E}[XY]\neq \text{E}[X]\text{E}[Y]$ in general. Equality is guaranteed if $X$ and $Y$ are independent, but it can also hold for some dependent variables, so equality alone does not imply independence.
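Some of these properties can be checked exactly on a small example. The sketch below assumes $X$ uniform on $\{-1,0,1\}$ and $Y=X^{2}$, so $Y$ is completely determined by $X$ and the two are dependent. It also assumes the standard result that $\text{E}[g(X)]=\sum_{i} g(x_{i})p_{X}(x_{i})$, which is not stated above. The helper name `expect` is illustrative.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2.
pmf_x = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}

def expect(g):
    """E[g(X)] computed as the pmf-weighted sum of g over the outcomes of X."""
    return sum(g(x) * p for x, p in pmf_x.items())

e_x = expect(lambda x: x)            # E[X]  = 0
e_y = expect(lambda x: x ** 2)       # E[Y]  = 2/3
e_xy = expect(lambda x: x * x ** 2)  # E[XY] = E[X**3] = 0

# Linearity: E[aX + bY] equals a E[X] + b E[Y].
a, b = 3, -2
lhs = expect(lambda x: a * x + b * x ** 2)
rhs = a * e_x + b * e_y

# The product rule holds here even though X and Y are dependent...
product_rule_holds = (e_xy == e_x * e_y)
# ...but fails for the pair (X, X): E[X * X] = 2/3 while E[X] * E[X] = 0.
product_rule_fails = (expect(lambda x: x * x) != e_x * e_x)

print(lhs, rhs, product_rule_holds, product_rule_fails)
```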

Expected value arrays#

When dealing with a random vector $\mathbf{X}=(X_{1},\ldots,X_{N})$, the expected value vector or mean vector is the vector of expected values:

\text{E}[\mathbf{X}]=(\text{E}[X_{1}],\ldots,\text{E}[X_{N}])

In this case, the linearity property looks like

\text{E}[\mathrm{A}\mathbf{X}+\mathbf{b}]=\mathrm{A}\text{E}[\mathbf{X}]+\mathbf{b}

where $\mathrm{A}$ is a constant $N\times N$ matrix and $\mathbf{b}$ is a constant $N$-dimensional vector. The same convention extends to a random matrix, whose expected value matrix is the matrix of the expected values of its entries.
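The vector form of linearity can be verified on a small discrete example. The sketch below assumes a hypothetical joint probability mass function for a two-dimensional random vector, together with an arbitrary matrix `A` and vector `b`. It compares the mean of the transformed vector, computed directly, with the transformation applied to the mean vector. All names and numbers are illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf of X = (X1, X2); the probabilities sum to 1.
joint_pmf = {
    (0, 0): Fraction(1, 4),
    (1, 0): Fraction(1, 4),
    (0, 2): Fraction(1, 6),
    (1, 2): Fraction(1, 3),
}

A = ((2, 1),
     (0, 3))
b = (1, -1)

def affine(v):
    """Return A v + b for a 2-component vector v."""
    return tuple(
        sum(A[r][c] * v[c] for c in range(len(v))) + b[r]
        for r in range(len(A))
    )

def mean_vector(pmf, transform=lambda v: v):
    """Component-wise expected value of transform(X) under the joint pmf."""
    dim = len(transform(next(iter(pmf))))
    return tuple(
        sum(transform(v)[k] * p for v, p in pmf.items())
        for k in range(dim)
    )

mean_x = mean_vector(joint_pmf)          # E[X]
direct = mean_vector(joint_pmf, affine)  # E[A X + b], computed outcome by outcome
via_linearity = affine(mean_x)           # A E[X] + b

print(mean_x, direct, via_linearity)
```

The last two tuples agree, as the linearity property predicts.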