In the previous post, we introduced determinants of matrices (and by extension linear maps) via their multilinearity properties, and as the change-of-volume factor. We also discussed how to calculate them: via row operations, via Laplace expansion, or directly as a sum of products of entries over permutations.
The question remains of why this is ever a useful quantity to consider, and this post tries to answer it. We’ll start with one example where the determinant arises very naturally, then turn to the main abstract setting, the case where the determinant is zero, and finish with a particularly nice example of this.
Jacobians as determinants
We consider integration by substitution. Firstly, in one variable: when it comes to Riemann integration of a function g(x) with respect to x, we view dx as the width of a small column which approximates the function near x. Now, if we reparameterise, that is if we write x=f(y) for some well-behaved (in particular differentiable) function f, then the width of the column is dx = (dx/dy) dy = f'(y) dy. This may be negative, if y is decreasing while x is increasing, but for now let’s not worry about this overly, for example by assuming the function g is non-negative. Thus if we want to integrate with y as the variable, we multiply the integrand by this factor f'(y).
What about in higher dimensions? We have exactly the same situation, only instead of two-dimensional columns, we have (n+1)-dimensional columns. We then multiply the n-dimensional volume of the base by the height, again given by $g(x)$. If we have a similar transformation of the base variable, $x=f(y)$ with $x,y\in\mathbb{R}^n$, we differentiate to get
$$dx_i = \sum_{j=1}^n \frac{\partial f_i}{\partial y_j}\, dy_j, \qquad i=1,\ldots,n.$$
In other words
$$dx = J\, dy,$$
where J is the Jacobian matrix of partial derivatives, with entries $J_{ij} = \partial f_i / \partial y_j$. In particular, we know how to relate the volume $dx_1\cdots dx_n$ to the volume $dy_1\cdots dy_n$. The ratio is simply the determinant of the Jacobian J. So if we want to integrate with respect to $y$, it only remains to pre-multiply the integrand by $|\det J|$ and proceed otherwise as in the one-dimensional case.
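As a quick numerical sanity check of this recipe, here is a minimal sketch using polar coordinates, where $(x_1,x_2) = (r\cos\theta, r\sin\theta)$. The choice of example, the function names and the midpoint-rule integrator are all illustrative assumptions, not anything from the previous post. The determinant of the matrix of partial derivatives works out to $r$, and including that factor recovers the area of the unit disc.

```python
import math

def polar_map(r, theta):
    """The substitution (x1, x2) = f(r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

def jacobian_det(r, theta):
    """Determinant of the 2x2 matrix of partial derivatives of polar_map."""
    dx1_dr, dx1_dtheta = math.cos(theta), -r * math.sin(theta)
    dx2_dr, dx2_dtheta = math.sin(theta), r * math.cos(theta)
    return dx1_dr * dx2_dtheta - dx1_dtheta * dx2_dr  # simplifies to r

def integrate_over_disc(g, radius=1.0, steps=400):
    """Midpoint-rule approximation of the integral of g over a disc,
    computed in (r, theta) coordinates with the |det J| factor included."""
    dr = radius / steps
    dtheta = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        for j in range(steps):
            theta = (j + 0.5) * dtheta
            x1, x2 = polar_map(r, theta)
            total += g(x1, x2) * abs(jacobian_det(r, theta)) * dr * dtheta
    return total

# Integrating the constant 1 should give the area of the unit disc, pi.
area = integrate_over_disc(lambda x1, x2: 1.0)
# Integrating x1^2 + x2^2 should give pi / 2.
second_moment = integrate_over_disc(lambda x1, x2: x1 * x1 + x2 * x2)
print(area, second_moment)
```

Dropping the `abs(jacobian_det(...))` factor would instead compute the area of the rectangle $[0,1]\times[0,2\pi]$ in the $(r,\theta)$ plane, which is exactly the mistake the determinant corrects.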
Det A = 0
A first linear algebra course might well motivate introducing matrices as a notational shortcut for solving families of linear equations, $Ax=b$, where $A$ is an $n\times n$ matrix. The main idea is that generally we can solve this equation uniquely. Almost all of the theory developed in such a first linear algebra course deals with the case when this fails to hold. In particular, there are many ways to characterise this case, and we list some of them now:
- Ax=b has no solutions for some b;
- A is not invertible;
- A has non-trivial kernel, that is, with dimension at least one;
- A does not have full rank, that is, the image has dimension less than n;
- The columns (or indeed the rows) are linearly dependent;
- The matrix can be row-reduced to a matrix with a row of zeroes.
It is useful that these are equivalent, as in abstract problems one can choose whichever interpretation from this list is most relevant. However, all of these are quite hard to check. Exhibiting a non-trivial kernel element is hard – one either has to do manual row-reduction, or the equivalent in the context of linear equations. But we can add the characterisation
- det A = 0;
to the list. And this is genuinely much easier to check for specific examples, either abstract or numerical.
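Let’s quickly convince ourselves of a couple of these equivalences. Row operations change the determinant only by a non-zero factor (swapping two rows flips the sign, scaling a row scales the determinant, and adding a multiple of one row to another leaves it unchanged), so whether the determinant vanishes is preserved under row-reduction. By multilinearity it is certainly the case that det A = 0 if A has a row of zeroes. We also said that det A is the change-of-volume factor. Note that A is a map from the domain to its image, so if A has less than full rank, the image lies in a subspace of dimension less than n, and so the image of any set has zero n-dimensional volume.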
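The row-reduction argument can be sketched in a few lines of code. This is only an illustrative implementation under stated assumptions: the function name is made up for this post, and it uses exact rational arithmetic from the standard library so that "equals zero" is a meaningful test. It tracks how each row operation changes the determinant, and returns zero as soon as a column has no available pivot, which is the situation where row-reduction produces a row of zeroes.

```python
from fractions import Fraction

def det_by_row_reduction(matrix):
    """Determinant of a square matrix via Gaussian elimination.

    Row swaps flip the sign, and adding a multiple of one row to
    another leaves the determinant unchanged, so the answer is the
    signed product of the pivots.
    """
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return det

singular = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
regular = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]

# (1, -2, 1) is a non-trivial kernel element of the singular matrix.
kernel_vector = [1, -2, 1]
image_of_kernel_vector = [
    sum(entry * v for entry, v in zip(row, kernel_vector)) for row in singular
]

print(det_by_row_reduction(singular), det_by_row_reduction(regular))
print(image_of_kernel_vector)
```

The singular example has determinant zero and a visible kernel element, while the second matrix has a non-zero determinant, matching the equivalences in the list.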
The Vandermonde matrix
This is a good example of this theory in practice. Consider the Vandermonde matrix where each row is a geometric progression:
$$V = \begin{pmatrix} 1 & \alpha_1 & \alpha_1^2 & \cdots & \alpha_1^{n-1} \\ 1 & \alpha_2 & \alpha_2^2 & \cdots & \alpha_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \alpha_n & \alpha_n^2 & \cdots & \alpha_n^{n-1} \end{pmatrix}.$$
Now suppose we attempt to solve
$$V \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}.$$
There’s a natural interpretation to this, that’s especially clear with this suggestive notation. Each row corresponds to a polynomial, where the coefficients are given by the $a_k$, and the argument is given by $\alpha_i$.
So if we try to solve for $(a_k)$, given $(\alpha_i)$ and $(b_i)$, we are asking whether we can find a polynomial P with degree at most n-1 such that $P(\alpha_i)=b_i$ for $i=1,\ldots,n$. Lagrange interpolation gives an argument where we just directly write down the relevant polynomial, but we can deploy our linear algebraic arguments too.
The equivalence of all these statements means that to verify existence and uniqueness of such a polynomial, we only need to check that the Vandermonde matrix has non-zero determinant. And in fact there are a variety of methods to show that
$$\det V = \prod_{1\le i<j\le n} (\alpha_j - \alpha_i).$$
For the polynomial question to be meaningful, we would certainly demand that the $\alpha_i$ are distinct, and so this determinant is non-zero, and we’ve shown that n points determine a polynomial of degree at most n-1 uniquely.
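To make this concrete, here is a small sketch that builds a Vandermonde matrix, evaluates the product formula for its determinant, and solves the linear system for the polynomial coefficients. The names (`nodes` for the arguments, `values` for the right-hand side, and the helper functions) are assumptions made for this illustration, and the solver is plain Gauss-Jordan elimination over exact fractions rather than anything specialised.

```python
from fractions import Fraction

def vandermonde(nodes):
    """Matrix whose i-th row is (1, x_i, x_i^2, ..., x_i^(n-1))."""
    n = len(nodes)
    return [[Fraction(x) ** k for k in range(n)] for x in nodes]

def vandermonde_det(nodes):
    """Product of (x_j - x_i) over all pairs i < j."""
    det = Fraction(1)
    for j in range(len(nodes)):
        for i in range(j):
            det *= Fraction(nodes[j]) - Fraction(nodes[i])
    return det

def solve(matrix, rhs):
    """Gauss-Jordan elimination; assumes the matrix is square and invertible."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients (constant term first)."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))

nodes = [0, 1, 2]     # distinct arguments
values = [1, 3, 11]   # prescribed values of the polynomial at the nodes

coeffs = solve(vandermonde(nodes), values)
print(vandermonde_det(nodes))   # non-zero because the nodes are distinct
print(coeffs)                   # coefficients of 1 - x + 3x^2
print([evaluate(coeffs, x) for x in nodes])
```

If two of the nodes coincide, the product formula gives zero and `solve` fails to find a pivot, which is the linear-algebraic shadow of asking one polynomial to take two values at the same point.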
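If we multiply on the left instead, suppose that we are considering a discrete probability distribution X that takes n known (distinct) values $\alpha_1,\ldots,\alpha_n$ with unknown probabilities $p_1,\ldots,p_n$. Then, with $V$ the Vandermonde matrix whose $i$th row is $(1,\alpha_i,\alpha_i^2,\ldots,\alpha_i^{n-1})$, we have
$$(p_1,\ldots,p_n)\, V = \left(1, \mathbb{E}[X], \mathbb{E}[X^2], \ldots, \mathbb{E}[X^{n-1}]\right).$$
So, again by inverting the Vandermonde matrix (which we know is possible since its determinant is non-zero…) we can recover the distribution from the first (n-1) moments of the distribution.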
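Here is a sketch of that recovery. Rather than inverting the matrix by elimination, it uses the fact that the inverse of a Vandermonde matrix is built from the coefficients of the Lagrange basis polynomials: if $L_j$ is the polynomial of degree at most n-1 that equals 1 at the j-th value and 0 at the others, then the j-th probability is $\mathbb{E}[L_j(X)]$, which is a linear combination of the moments. The function names and the example distribution are assumptions for illustration.

```python
from fractions import Fraction

def lagrange_basis_coeffs(support, j):
    """Coefficients (constant term first) of the polynomial L_j with
    L_j(support[j]) = 1 and L_j(support[i]) = 0 for i != j."""
    coeffs = [Fraction(1)]
    for i, v in enumerate(support):
        if i == j:
            continue
        scale = Fraction(support[j]) - Fraction(v)
        # Multiply the running polynomial by (x - v) / (support[j] - v).
        times_x = [Fraction(0)] + coeffs
        times_v = [c * Fraction(v) for c in coeffs] + [Fraction(0)]
        coeffs = [(s - t) / scale for s, t in zip(times_x, times_v)]
    return coeffs

def probabilities_from_moments(support, moments):
    """Recover P(X = support[j]) from moments E[X^0], ..., E[X^(n-1)]."""
    return [
        sum(c * Fraction(m)
            for c, m in zip(lagrange_basis_coeffs(support, j), moments))
        for j in range(len(support))
    ]

# A hypothetical distribution, used only to generate consistent moments.
support = [0, 1, 2]
true_probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
moments = [
    sum(p * Fraction(v) ** k for p, v in zip(true_probs, support))
    for k in range(len(support))
]

recovered = probabilities_from_moments(support, moments)
print(moments)
print(recovered)
```

The zeroth moment is always 1, so it carries the normalisation constraint; the genuinely informative inputs are the moments of order 1 up to n-1.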
A similar argument applies to show that the Discrete Fourier Transform is invertible, and in this case (where the arguments $\alpha_i$ of the Vandermonde matrix are the $n$th roots of unity), the expression for the Vandermonde determinant is particularly tractable.
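For the Discrete Fourier Transform one can go further and write the inverse down explicitly. The sketch below builds the transform as a Vandermonde matrix on the $n$th roots of unity and checks numerically that its entrywise complex conjugate, divided by n, undoes it. The sign convention $\omega = e^{-2\pi i/n}$, the helper names and the sample signal are assumptions made for this illustration; other conventions differ only in sign and normalisation.

```python
import cmath

def dft_matrix(n):
    """Vandermonde matrix whose j-th row is a geometric progression
    in omega^j, where omega is a primitive n-th root of unity."""
    omega = cmath.exp(-2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]

def inverse_dft_matrix(n):
    """Entrywise complex conjugate of the DFT matrix, divided by n."""
    return [[entry.conjugate() / n for entry in row] for row in dft_matrix(n)]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

signal = [1.0, 2.0, 0.0, -1.0, 3.5, 0.25]
n = len(signal)

spectrum = apply(dft_matrix(n), signal)
recovered = apply(inverse_dft_matrix(n), spectrum)

# Largest discrepancy between the original and round-tripped signal;
# this should be at the level of floating-point rounding error.
max_error = max(abs(r - s) for r, s in zip(recovered, signal))
print(max_error)
```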