Positive Matrices as Generalizations of Positive Numbers

A conceptual intuition for positive definite matrices from which most of their core properties can be derived

This is the first part of what is hopefully a sequence of posts to help build intuition on linear algebra. I'm certainly not the first person to think of this - Gregory Gundersen has an extremely similar and arguably more intuitive post here - but I stumbled across the idea myself and was proud of it, so I'm writing it up here anyway.

Here's an interesting, perhaps obvious property of positive numbers: when you multiply a positive number $p$ by any non-zero number $a$, the result lies on the same side of the number line as the original number $a$ did. Let's take the positive number $2$ as an example:

$$\begin{align*} 2 \cdot 3 = 6 &\qquad \text{6 is on the same side of the number line as 3}\\ 2 \cdot (-5) = -10 &\qquad \text{-10 is on the same side as -5} \end{align*}$$

In other words, positive numbers can be associated with sign-preserving functions. To formalize that last part, a positive number $p$ corresponds to a linear map $p: \R \to \R$ defined by $p(a) = p \cdot a$ (show for yourself this is linear!)
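As a quick sanity check, here's a minimal sketch in plain Python (the sample values and variable names are my own) verifying both the sign preservation and the linearity of the map $a \mapsto p \cdot a$:

```python
# A minimal sketch: multiplication by a positive number p is a
# sign-preserving linear map on the reals.
p = 2  # an arbitrary positive number

for a in [3, -5, 0.1, -0.001]:  # non-zero test inputs
    image = p * a               # the map p(a) = p * a
    # a and its image share a sign exactly when their product is positive
    assert a * image > 0, (a, image)

# Linearity check: p(x + c*y) == p(x) + c*p(y) on sample values
x, y, c = 3.0, -5.0, 7.0
assert p * (x + c * y) == p * x + c * (p * y)
print("multiplication by", p, "preserves signs and is linear")
```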

If you've taken a good linear algebra course, then you'll know that matrices correspond to linear maps. Here's where the crucial piece of intuition comes in: we view positive definite matrices as generalizations of the above characterization of positive numbers to $\R^k$. A positive definite matrix $P$ corresponds to a linear map that sends any non-zero vector $v$ to another vector $Pv$ that lies on the same half of the vector space as $v$ does.

How do we define "same half of the vector space as $v$"? With respect to $v$ itself! If you consider the set of all vectors that are orthogonal to $v$, this set forms a hyperplane which divides your vector space in half. Then the 'side' that $v$ is on is simply the set of vectors $w$ such that the inner product $v^\top w > 0$. From this and our conceptual generalization above, we can derive the first definition of positive definite matrices $P$:

Definition 1: $v^\top P v > 0, \ \forall \, v \neq 0$
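To make Definition 1 concrete, here's a short NumPy sketch. The construction $P = A^\top A$ is a standard way to manufacture a positive definite matrix (it isn't from the post itself): for any $A$ with linearly independent columns, $v^\top P v = \|Av\|^2 > 0$ for every $v \neq 0$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a matrix that is positive definite by construction: a random
# Gaussian matrix A almost surely has independent columns, so P = A^T A
# is symmetric and satisfies v^T P v = ||Av||^2 > 0 for all v != 0.
A = rng.standard_normal((4, 4))
P = A.T @ A

# Empirically check Definition 1 on random non-zero vectors: the inner
# product between v and Pv is positive, i.e. Pv stays on v's side of
# the hyperplane orthogonal to v.
for _ in range(1000):
    v = rng.standard_normal(4)
    assert v @ P @ v > 0

print("v^T P v > 0 held for all sampled v")
```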

In other words, we view the quadratic product $v^\top P v$ as the inner product between $v$ and $Pv$. Notice how in the positive number/1D case, the dot product reduces to multiplying scalars: for all $a \neq 0$, $a \cdot (p \cdot a) = a^2 p > 0$, which indeed holds for positive $p$.

Before we continue, we should note that this conceptual framework restricts positive matrices to being square matrices: in our 1D case, positive numbers mapped from $\R$ to $\R$, and in higher dimensions it doesn't really make sense to take inner products between vectors of different sizes. It should also follow (though not very obviously) that positive matrices must be symmetric. We know the quadratic product $v^\top P v$ is a scalar, so it is equal to its transpose $v^\top P^\top v$. Then we can set