The covariance of two random variables is a measure of how linearly dependent they are. For two jointly-distributed variables $X$ and $Y$ with finite variance and expected values $\mu_{X}$ and $\mu_{Y}$, their covariance is defined as

$$
\text{cov}(X,Y)\equiv\text{E}[(X-\mu_{X})(Y-\mu_{Y})]
$$

Unlike variance, which is never negative, covariance may take any real value. High positive values indicate strong correlation (increasing $X$ tends to increase $Y$), whereas high negative values indicate strong anticorrelation (increasing $X$ tends to decrease $Y$).

The correlation coefficient, or just correlation, is a scale-independent form of the covariance, defined as

$$
\rho_{X,Y}\equiv\frac{\text{cov}(X,Y)}{\sigma_{X}\sigma_{Y}}
$$

which is defined in $[-1,1]$. It has the same meaning as the covariance, but with normalized values. $\sigma_{X}$ and $\sigma_{Y}$ are the standard deviations of $X$ and $Y$. By convention, two variables with $|\rho_{X,Y}|$ close to $0$ are said to be weakly correlated, whereas variables with $|\rho_{X,Y}|$ close to $1$ are strongly correlated.
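As a rough illustration of the two definitions, the sketch below evaluates them for a small discrete joint distribution. The probability table `joint` and the helper `expectation` are made up for this example and are not part of the note.

```python
import math

# Hypothetical joint distribution: maps (x, y) to P(X = x, Y = y).
joint = {
    (0, 0): 0.4,
    (0, 1): 0.1,
    (1, 0): 0.1,
    (1, 1): 0.4,
}

def expectation(f):
    """E[f(X, Y)] under the hypothetical joint distribution above."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

mu_x = expectation(lambda x, y: x)
mu_y = expectation(lambda x, y: y)

# cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov_xy = expectation(lambda x, y: (x - mu_x) * (y - mu_y))

# Standard deviations are the square roots of the variances.
sigma_x = math.sqrt(expectation(lambda x, y: (x - mu_x) ** 2))
sigma_y = math.sqrt(expectation(lambda x, y: (y - mu_y) ** 2))

# Correlation: the covariance normalized by both standard deviations.
rho = cov_xy / (sigma_x * sigma_y)

print(cov_xy, rho)
```

Rescaling `x` (say, changing its units) would change `cov_xy` but leave `rho` untouched, which is what scale-independent means here.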
## Properties
- It commutes: $\text{cov}(X,Y)=\text{cov}(Y,X)$.
- If $X$ and $Y$ are independent variables, then $\text{cov}(X,Y)=0$. The converse is not true: if $\text{cov}(X,Y)=0$, $X$ and $Y$ are not in general independent. They are only uncorrelated, meaning they have no linear dependence, and they may still be nonlinearly dependent. The covariance does not provide information on nonlinear correlation.
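A standard counterexample for the second property can be checked by hand or with a few lines of code. The sketch below assumes $X$ uniform on $\{-1,0,1\}$ and $Y=X^{2}$, a choice made here for illustration, and uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Assumed setup: X is uniform on {-1, 0, 1} and Y = X**2,
# so Y is completely determined by X.
xs = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X

mu_x = sum(p * x for x in xs)        # E[X]
mu_y = sum(p * x**2 for x in xs)     # E[Y] = E[X^2]

# cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov_xy = sum(p * (x - mu_x) * (x**2 - mu_y) for x in xs)

# Dependence check: compare P(X=0, Y=0) with P(X=0) * P(Y=0).
p_joint = p        # X = 0 forces Y = 0
p_product = p * p  # P(X=0) = 1/3 and P(Y=0) = 1/3

print(cov_xy, p_joint, p_product)
```

The covariance vanishes even though the joint probability does not factor, so the variables are uncorrelated but dependent.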
## Examples
### Covariance matrix

The **covariance matrix** $\Sigma$ of a random vector $\mathbf{X}=(X_{1},\ldots,X_{N})$ is the [[matrix]] that contains all of the variances and covariances of the system. It is defined by its elements $\Sigma_{ij}$ as

$$
\Sigma_{ij}\equiv\rho_{ij}\sigma_{i}\sigma_{j}
$$

or, in full,

$$
\Sigma\equiv \text{E}[(\mathbf{X}-\text{E}[\mathbf{X}])(\mathbf{X}-\text{E}[\mathbf{X}])^{T}]=
\begin{pmatrix}
\text{var}(X_{1}) & \text{cov}(X_{1},X_{2}) & \ldots & \text{cov}(X_{1},X_{N}) \\
\text{cov}(X_{2},X_{1}) & \text{var}(X_{2}) & \ldots & \text{cov}(X_{2},X_{N}) \\
\vdots & \vdots & \ddots & \vdots \\
\text{cov}(X_{N},X_{1}) & \text{cov}(X_{N},X_{2}) & \ldots & \text{var}(X_{N})
\end{pmatrix}
$$
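The element-wise definition can be sketched directly in code. The standard deviations `sigma` and the correlation table `rho` below are hypothetical values for $N=3$, chosen only so that the result is a valid covariance matrix.

```python
# Hypothetical standard deviations and correlation coefficients for N = 3.
sigma = [1.0, 2.0, 0.5]
rho = [
    [1.0, 0.3, -0.2],
    [0.3, 1.0, 0.5],
    [-0.2, 0.5, 1.0],
]

n = len(sigma)

# Sigma_ij = rho_ij * sigma_i * sigma_j
cov_matrix = [
    [rho[i][j] * sigma[i] * sigma[j] for j in range(n)]
    for i in range(n)
]

for row in cov_matrix:
    print(row)
```

Since $\rho_{ii}=1$, the diagonal of `cov_matrix` holds the variances $\sigma_{i}^{2}$, and the symmetry of `rho` carries over to the result.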
#### Properties

- Due to the commutativity of covariance, the covariance matrix is a [[Symmetric matrix|symmetrical matrix]], so $\Sigma_{ij}=\Sigma_{ji}$.
- It is [[Matrix sign definitions|positive semidefinite]]. Hence, it has nonnegative [[determinant]] $\det \Sigma\geq 0$. It is [[Invertible matrix|invertible]] only when it is positive definite, that is, when $\det \Sigma>0$.
- The diagonal contains the variance of each random variable: $\Sigma_{ii}=\sigma ^{2}_{i}$.
- If all variables are independent, it is [[Diagonalization|diagonal]].
- The covariance matrix of a linear transformation $\mathrm{A}\mathbf{X}+\mathbf{b}$ is $\Sigma_{\mathrm{A}\mathbf{X}+\mathbf{b}}=\mathrm{A}\Sigma_{\mathbf{X}}\mathrm{A}^{T}$.

### Sample covariance

Like the regular variance, calculating the covariance requires knowing the true means of the random variables. When these are not known, they are estimated by the [[Arithmetic mean|sample means]] $\bar{X}$ and $\bar{Y}$ calculated on the [[sample|samples]] $X_{1},\ldots,X_{N}$ and $Y_{1},\ldots,Y_{N}$. Also just like the regular variance, we look for the average deviation from the sample mean:

$$
V_{X,Y,\text{biased}}=\frac{1}{N}\sum_{i=1}^{N} (X_{i}-\bar{X})(Y_{i}-\bar{Y})
$$

This estimator is biased. Dividing by $N-1$ instead of $N$ removes the bias:

$$
V_{X,Y}=\frac{1}{N-1}\sum_{i=1}^{N} (X_{i}-\bar{X})(Y_{i}-\bar{Y})
$$

This is the appropriate estimator for the true covariance, as $\mathrm{E}[V_{X,Y}]=\text{cov}(X,Y)$. The **sample correlation coefficient** can be calculated using the sample variances and covariance:

$$
r=\frac{V_{X,Y}}{\sqrt{ S^{2}_{X}S^{2}_{Y} }}
$$
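The two estimators translate almost line by line into code. The following is a minimal sketch in plain Python. The function names and the data are invented for illustration, and the sample variance $S^{2}$ is computed as the sample covariance of a sample with itself.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance V_{X,Y} with the 1/(N-1) factor."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length N >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """r = V_{X,Y} / sqrt(S^2_X * S^2_Y)."""
    s2_x = sample_covariance(xs, xs)  # sample variance of X
    s2_y = sample_covariance(ys, ys)  # sample variance of Y
    return sample_covariance(xs, ys) / math.sqrt(s2_x * s2_y)

# Made-up paired observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(sample_covariance(xs, ys), sample_correlation(xs, ys))
```

Note that the $1/(N-1)$ factors cancel in $r$, so the biased and corrected covariances would give the same correlation as long as the same convention is used for the variances.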
*However*, unlike $S^{2}$ and $V$, which are quite well-behaved and unbiased (after correction), the sample correlation is *not*. $r$ is only unbiased asymptotically and, to make things worse, point estimates of $r$ usually have relatively high variance.
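The small-sample behaviour of $r$ can be explored with a quick simulation. The sketch below makes several assumptions that are not in the note: the data are bivariate normal with true correlation $0.5$, the sample sizes are $N=5$ and $N=100$, and the seed and trial counts are arbitrary.

```python
import math
import random

def sample_correlation(xs, ys):
    """Sample correlation r; the 1/(N-1) factors cancel, so raw sums suffice."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_mean_r(rho, n, trials, seed=0):
    """Average r over many samples of size n drawn with true correlation rho."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs, ys = [], []
        for _ in range(n):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            xs.append(z1)
            # Mixing two independent normals gives correlation rho with z1.
            ys.append(rho * z1 + math.sqrt(1.0 - rho**2) * z2)
        total += sample_correlation(xs, ys)
    return total / trials

true_rho = 0.5  # assumed true correlation
mean_r_small = simulate_mean_r(true_rho, n=5, trials=20000)
mean_r_large = simulate_mean_r(true_rho, n=100, trials=2000)

print(mean_r_small, mean_r_large)
```

Under these assumptions, the average of $r$ for the small samples would be expected to fall somewhat below the true value, while the average for the larger samples should sit close to it.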