What Is a Covariance Matrix?
A covariance matrix is a square matrix that summarises how a set of variables vary — individually and together. For $k$ variables, it is a $k \times k$ matrix where the diagonal entries are the variances of each variable and the off-diagonal entries are the covariances between each pair. It is also called the variance–covariance matrix.
Covariance itself measures whether two variables move together: it is positive when they tend to rise and fall together, negative when one tends to rise as the other falls, and zero when they have no linear relationship. The covariance matrix collects all of these pairwise values into one object, generalising the single-variable idea of variance to many dimensions at once. In practice each entry is estimated from data, so the matrix is something you compute from a sample rather than a fixed quantity you are simply handed.
What Is the Covariance Matrix Formula?
Two formulas build the matrix: one for the diagonal (variance) and one for the off-diagonal (covariance). For a sample of $n$ data points, the sample variance of variable $X$ with mean $\bar{X}$ is:
$$\text{Var}(X) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2,$$
and the sample covariance between $X$ and $Y$ is:
$$\text{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}).$$
The full matrix for two variables is then assembled by placing variances on the diagonal and the shared covariance in both off-diagonal slots:
$$\Sigma = \begin{bmatrix} \text{Var}(X) & \text{Cov}(X, Y) \\ \text{Cov}(X, Y) & \text{Var}(Y) \end{bmatrix}.$$
The off-diagonal entries are identical because $\text{Cov}(X, Y) = \text{Cov}(Y, X)$ — which is why every covariance matrix is symmetric.
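The two formulas translate almost line for line into code. The following is a minimal sketch in plain Python, not a library implementation: the function names `sample_cov` and `cov_matrix_2x2` and the data values are illustrative assumptions. Variance is computed as the covariance of a variable with itself, which is what the variance formula reduces to.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def sample_cov(xs, ys):
    """Sample covariance: deviation products summed, then divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1)


def cov_matrix_2x2(xs, ys):
    """Assemble [[Var(X), Cov(X, Y)], [Cov(X, Y), Var(Y)]]."""
    c_xy = sample_cov(xs, ys)
    # Var(X) is the same computation as Cov(X, X).
    return [[sample_cov(xs, xs), c_xy],
            [c_xy, sample_cov(ys, ys)]]


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
heights = [1.0, 2.0, 3.0, 4.0]
weights = [2.0, 4.0, 5.0, 9.0]
sigma = cov_matrix_2x2(heights, weights)
```

Because the same `c_xy` value is written into both off-diagonal slots, the result is symmetric by construction.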
What does the covariance matrix look like for more than two variables?
It grows to match. Three variables give a $3 \times 3$ matrix, with the three variances down the diagonal and the three distinct pairwise covariances (each appearing twice, mirrored across the diagonal) filling the rest:
$$\Sigma = \begin{bmatrix} \text{Var}(X) & \text{Cov}(X,Y) & \text{Cov}(X,Z) \\ \text{Cov}(X,Y) & \text{Var}(Y) & \text{Cov}(Y,Z) \\ \text{Cov}(X,Z) & \text{Cov}(Y,Z) & \text{Var}(Z) \end{bmatrix}.$$
The pattern holds for any number of variables: a $k \times k$ symmetric matrix.
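For $k$ variables, all entries can be produced in one step from a centred data matrix. If $X_c$ holds $n$ observations as rows with each column's mean subtracted, then $\Sigma = \frac{1}{n-1} X_c^{T} X_c$. The sketch below assumes NumPy is available and uses a made-up data set of 4 observations of 3 variables. It compares the result with `numpy.cov`, which needs `rowvar=False` when variables are stored as columns.

```python
import numpy as np

# Hypothetical data: 4 observations (rows) of 3 variables (columns).
data = np.array([
    [2.0, 1.0, 9.0],
    [4.0, 3.0, 7.0],
    [6.0, 5.0, 2.0],
    [8.0, 4.0, 1.0],
])

n = data.shape[0]
centred = data - data.mean(axis=0)       # subtract each column's mean
sigma = centred.T @ centred / (n - 1)    # k x k sample covariance matrix

# np.cov treats rows as variables by default, so pass rowvar=False here.
reference = np.cov(data, rowvar=False)

print(sigma.shape)                       # (3, 3)
print(np.allclose(sigma, reference))     # True
```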
What Are the Properties of a Covariance Matrix?
Every covariance matrix obeys a short, reliable list of rules. Each one is worth knowing because together they constrain what a valid covariance matrix can even look like.
Square. For $k$ variables it is always $k \times k$.
Symmetric. $\Sigma^{T} = \Sigma$, because $\text{Cov}(X, Y) = \text{Cov}(Y, X)$. This connects directly to the broader idea of a symmetric matrix.
Positive semi-definite. For any vector $v$, $v^{T}\Sigma v \geq 0$. The quantity $v^{T}\Sigma v$ is the variance of the weighted combination of the variables with weights $v$, and no variance can be negative.
Diagonal entries are non-negative. Each is a variance, and a variance is never negative.
Real, non-negative eigenvalues. A direct consequence of being symmetric positive semi-definite — and the reason the matrix behaves well under eigenvector analysis.
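These rules can be turned into a quick validity check. The sketch below is one possible way to do it, assuming NumPy. The function name `is_valid_covariance` and the tolerance `tol` are illustrative choices, and the tolerance exists only to absorb floating-point rounding.

```python
import numpy as np


def is_valid_covariance(matrix, tol=1e-10):
    """Return True if `matrix` is square, symmetric, and positive semi-definite."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False                      # not square
    if not np.allclose(m, m.T, atol=tol):
        return False                      # not symmetric
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(m)
    return bool(eigenvalues.min() >= -tol)


print(is_valid_covariance([[4, 4], [4, 4]]))   # True: eigenvalues 0 and 8
print(is_valid_covariance([[1, 2], [2, 1]]))   # False: eigenvalues -1 and 3
print(is_valid_covariance([[5, 2], [3, 8]]))   # False: not symmetric
```

The second matrix shows why symmetry and non-negative diagonal entries alone are not enough: it passes both tests yet has a negative eigenvalue.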
Examples of Covariance Matrix
The set runs from computing a single covariance, through the divisor mistake, into a full 2×2 build, reading the sign of an off-diagonal entry, checking symmetry, and interpreting a zero covariance.
Example 1
Find $\text{Cov}(X, Y)$ for $X = \{2, 4, 6\}$ and $Y = \{1, 3, 5\}$.
Means: $\bar{X} = 4$, $\bar{Y} = 3$. Deviations multiplied and summed:
$$(2-4)(1-3) + (4-4)(3-3) + (6-4)(5-3) = (-2)(-2) + 0 + (2)(2) = 4 + 0 + 4 = 8.$$
Divide by $n - 1 = 2$:
$$\text{Cov}(X, Y) = \frac{8}{2} = 4.$$
Final answer: $\text{Cov}(X, Y) = 4$. Positive, so $X$ and $Y$ rise together.
Example 2
A common slip — recompute the sample covariance above using the right divisor.
Wrong attempt. A student computes the sum of deviation products correctly as $8$, then divides by $n = 3$ (the count of data points) to get $\text{Cov}(X, Y) = 8/3 \approx 2.67$. The arithmetic is clean — the divisor is the trap.
Correct. For a sample covariance, the divisor is $n - 1$, not $n$ (this is Bessel's correction, which removes the bias from estimating the mean from the same data). Only a population covariance divides by $n$.
$$\text{Cov}(X, Y) = \frac{8}{n-1} = \frac{8}{2} = 4.$$
Final answer: $\text{Cov}(X, Y) = 4$ for sample data. Dividing by $n$ would silently understate it — the result still "looks reasonable," which is exactly why the error survives unchecked.
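The same trap exists in software, because library functions do not all default to the same divisor. In NumPy, for instance, `np.cov` divides by $n - 1$ by default, while `np.var` divides by $n$ unless `ddof=1` is passed. A short sketch using the data from Example 1, assuming NumPy is installed:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0])
y = np.array([1.0, 3.0, 5.0])

sample_cov = np.cov(x, y)[0, 1]              # default divisor: n - 1
population_cov = np.cov(x, y, ddof=0)[0, 1]  # divisor: n

var_default = np.var(x)          # np.var defaults to ddof=0 (divides by n)
var_sample = np.var(x, ddof=1)   # matches the diagonal of np.cov(x, y)

print(sample_cov, population_cov)   # 4.0 and 8/3
print(var_default, var_sample)      # 8/3 and 4.0
```

Mixing `np.var(x)` with entries taken from `np.cov(x, y)` would put population values on the diagonal and sample values off it, so it is worth setting `ddof` explicitly in both calls.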
Example 3
Build the 2×2 covariance matrix for $X = \{2, 4, 6\}$ and $Y = \{1, 3, 5\}$.
From Example 1, $\text{Cov}(X, Y) = 4$. Now the variances. $\text{Var}(X) = \tfrac{(2-4)^2 + 0 + (6-4)^2}{2} = \tfrac{4 + 0 + 4}{2} = 4$, and $\text{Var}(Y) = \tfrac{(1-3)^2 + 0 + (5-3)^2}{2} = \tfrac{4 + 0 + 4}{2} = 4$. Assemble:
$$\Sigma = \begin{bmatrix} 4 & 4 \\ 4 & 4 \end{bmatrix}.$$
Final answer: the covariance matrix is $\begin{bmatrix} 4 & 4 \\ 4 & 4 \end{bmatrix}$, which is symmetric, as every covariance matrix must be.
Example 4
For $X = \{10, 5\}$ and $Y = \{3, 9\}$, the sample covariance matrix is reported as $\begin{bmatrix} 12.5 & -15 \\ -15 & 18 \end{bmatrix}$. What does the off-diagonal sign tell you?
The off-diagonal entry is $\text{Cov}(X, Y) = -15$, a negative number.
Final answer: the negative covariance means $X$ and $Y$ move in opposite directions: when $X$ is above its mean, $Y$ tends to be below its mean, and vice versa. The diagonal entries $12.5$ and $18$ are the sample variances of $X$ and $Y$, computed with the same divisor $n - 1 = 1$ as the covariance. Dividing by $n = 2$ instead would give the population values $6.25$, $-7.5$, and $9$; the two conventions must never be mixed inside one matrix.
Example 5
Is $M = \begin{bmatrix} 5 & 2 \\ 3 & 8 \end{bmatrix}$ a valid covariance matrix?
A covariance matrix must be symmetric: the $(1,2)$ entry must equal the $(2,1)$ entry. Here $2 \neq 3$.
Final answer: No. Because $M$ is not symmetric, it cannot be a covariance matrix. A genuine covariance matrix would need equal off-diagonal entries.
Example 6
A covariance matrix has $\text{Cov}(X, Y) = 0$. What does that mean about $X$ and $Y$?
A zero off-diagonal entry means the two variables have no linear relationship — they are linearly uncorrelated.
Final answer: $X$ and $Y$ are uncorrelated (no linear tendency to move together). A diagonal covariance matrix — zeros everywhere off the diagonal — describes a set of mutually uncorrelated variables.
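Zero covariance rules out a linear relationship only; the variables can still depend on each other in a non-linear way. The sketch below uses an illustrative data set, assumed here purely for demonstration, in which $Y = X^2$ exactly and yet the sample covariance is zero. It assumes NumPy is available.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2                      # Y is completely determined by X

cov_xy = np.cov(x, y)[0, 1]     # sample covariance (divisor n - 1)

# The positive and negative deviation products cancel exactly.
print(cov_xy)
```

The symmetric spread of $X$ around its mean makes the products on the left and right cancel, so the covariance carries no trace of the dependence.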
Why the Covariance Matrix Runs Modern Data Science
"Which direction does this cloud of data actually spread?"
The covariance matrix answers exactly that, and the answer drives some of the most-used techniques in computing and finance.
Principal component analysis (PCA). The eigenvectors of the covariance matrix point along the directions of greatest variance in the data; the eigenvalues measure how much variance lies along each. PCA uses this to compress hundreds of correlated features down to a handful of meaningful ones — the same matrix machinery behind face recognition, gene-expression analysis, and image compression.
Portfolio risk in finance. The covariance matrix of asset returns is the core input to Markowitz portfolio theory. Combining assets whose returns have low or negative covariance reduces the variance of the overall portfolio, which is the mathematical statement of "don't put all your eggs in one basket": diversification, written as off-diagonal entries.
The multivariate normal distribution. The bell curve generalises to many dimensions using the covariance matrix as its shape parameter — it sets the orientation and stretch of the probability cloud that underlies countless statistical models.
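Two of these uses can be sketched in a few lines. The example below assumes NumPy and a made-up 2×2 covariance matrix. It extracts the principal directions with `np.linalg.eigh` and then evaluates the portfolio variance $w^{T}\Sigma w$ for illustrative equal weights. It is a sketch of the underlying linear algebra, not a full PCA or portfolio-optimisation routine.

```python
import numpy as np

# Hypothetical covariance matrix of two positively related variables.
sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(sigma)

# Reorder so the direction of greatest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]         # columns are the principal directions

explained = eigenvalues / eigenvalues.sum()   # share of total variance per direction

# Portfolio variance for illustrative weights w is the quadratic form w^T Sigma w.
weights = np.array([0.5, 0.5])
portfolio_variance = weights @ sigma @ weights

print(explained)
print(portfolio_variance)                     # 0.25 * (4 + 2 + 2 + 3) = 2.75
```

The eigenvalues always sum to the trace of the matrix, which is the total variance, so `explained` sums to 1.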
Where Students Trip Up on the Covariance Matrix
Mistake 1: Using the wrong divisor (n vs n − 1)
Where it slips in: Computing a sample covariance, the student divides the sum of deviation products by $n$ instead of $n - 1$.
Don't do this: Divide by the raw count $n$ when the data is a sample drawn from a larger population.
The correct way: Sample variance and covariance use $n - 1$ (Bessel's correction); only population values use $n$.
Mistake 2: Confusing the covariance matrix with the correlation matrix
Where it slips in: A student treats covariance values as if they were bounded between $-1$ and $1$ like correlations.
Don't do this: Read a covariance of $-15$ as "almost no relationship" because it is "close to $-1$ in spirit." Covariance is unbounded and scale-dependent.
The correct way: Covariance can be any real number and changes when you rescale the data: if both variables are converted from metres to centimetres, every entry is multiplied by $100 \times 100 = 10{,}000$. The correlation matrix is the covariance matrix standardised to the $[-1, 1]$ range by dividing each entry by the product of the two standard deviations. The memorizer who learned "correlation is between $-1$ and $1$" often wrongly applies that bound to covariance too.
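The scale effect is easy to see numerically. The sketch below assumes NumPy and a small made-up data set in metres. The helper `cov_to_corr` is a hypothetical name for the standardisation step, and it assumes every variance is strictly positive.

```python
import numpy as np


def cov_to_corr(sigma):
    """Standardise a covariance matrix: divide entry (i, j) by sd_i * sd_j."""
    sigma = np.asarray(sigma, dtype=float)
    sd = np.sqrt(np.diag(sigma))          # standard deviations from the diagonal
    return sigma / np.outer(sd, sd)


# Hypothetical measurements in metres.
x_m = np.array([1.0, 2.0, 4.0, 7.0])
y_m = np.array([2.0, 1.0, 5.0, 6.0])

cov_m = np.cov(x_m, y_m)
cov_cm = np.cov(x_m * 100, y_m * 100)     # same data expressed in centimetres

scale = cov_cm[0, 1] / cov_m[0, 1]        # 100 * 100 = 10,000
corr_m = cov_to_corr(cov_m)
corr_cm = cov_to_corr(cov_cm)             # unchanged by the change of units

print(scale)
print(np.allclose(corr_m, corr_cm))       # True
```

The covariance grows by a factor of 10,000 while the correlation matrix stays the same, which is why correlation is the safer quantity to compare across data sets measured in different units.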
Mistake 3: Forgetting the matrix must be symmetric
Where it slips in: When building a matrix by hand, the student fills the two off-diagonal slots with different values.
Don't do this: Write $\text{Cov}(X, Y)$ in the top-right but a different number in the bottom-left.
The correct way: $\text{Cov}(X, Y) = \text{Cov}(Y, X)$ always, so the off-diagonal entries are mirror images. The second-guesser who recomputes the covariance "the other way around" and gets a slightly different number has almost always made an arithmetic slip — the two are mathematically identical.
Key Takeaways
A covariance matrix is a square matrix with variances on the diagonal and covariances off the diagonal.
It is always symmetric ($\Sigma^{T} = \Sigma$) and positive semi-definite, with real non-negative eigenvalues.
Sample variance and covariance divide by $n - 1$; population versions divide by $n$. Mixing the two up is the most common computational mistake.
Covariance is unbounded and scale-dependent, unlike the bounded correlation matrix it can be standardised into.
The eigenvectors of the covariance matrix drive PCA, and the matrix itself underpins portfolio risk models and the multivariate normal distribution.
Practice These Before Moving On
Find $\text{Cov}(X, Y)$ for $X = \{1, 3, 5\}$ and $Y = \{2, 4, 6\}$ as a sample.
Build the 2×2 sample covariance matrix for the data in Question 1.
State whether $\begin{bmatrix} 4 & -2 \\ -2 & 9 \end{bmatrix}$ could be a valid covariance matrix, and explain why.
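Answer to Question 1: means $\bar{X} = 3$, $\bar{Y} = 4$; deviation products $(-2)(-2) + 0 + (2)(2) = 8$; divide by $n-1 = 2$ to get $\text{Cov}(X, Y) = 4$. If your answer came out as $8/3$, return to Mistake 1 and recheck the divisor.
Answer to Question 2: $\text{Var}(X) = \text{Var}(Y) = 4$, so $\Sigma = \begin{bmatrix} 4 & 4 \\ 4 & 4 \end{bmatrix}$.
Answer to Question 3: yes. It is symmetric, its diagonal entries are positive, and its determinant $4 \cdot 9 - (-2)^2 = 32$ is positive, so it is positive semi-definite. For a symmetric 2×2 matrix, non-negative diagonal entries together with a non-negative determinant are enough; symmetry and positive variances alone are not.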
