The Resultant, Episode 4
This episode has a single purpose: to show that the two formulas for the resultant are equivalent. The next episode, the finale, will tie up some loose ends.
The formulas:
$$\det(S) \;=\; a_m^n\, b_n^m \prod_{i=1}^{m} \prod_{j=1}^{n} (u_i - v_j) \tag{1}$$

$$\det(S) \;=\; a_m^n \prod_{i=1}^{m} F(u_i) \;=\; (-1)^{mn}\, b_n^m \prod_{j=1}^{n} E(v_j) \tag{2}$$

where

$$E(x) = a_m x^m + \cdots + a_0 = a_m (x - u_1) \cdots (x - u_m) \tag{3a}$$

$$F(x) = b_n x^n + \cdots + b_0 = b_n (x - v_1) \cdots (x - v_n) \tag{3b}$$
Eq. (2) follows immediately from (1) once we expand either $F(u_i)$ or $E(v_j)$ using the right hand sides of (3). We assume that $m, n > 0$.
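(A quick way to keep ourselves honest: sympy can check these identities symbolically for small degrees. Here's a minimal sketch for $m = n = 2$, assuming the usual Sylvester layout with $n$ rows of $E$'s coefficients followed by $m$ rows of $F$'s; the names mirror the text, and nothing here is load-bearing for the proof.)

```python
# Sanity check of eqs. (1) and (2) for m = n = 2, with symbolic roots
# and leading coefficients.  Assumes the usual Sylvester layout.
from sympy import symbols, expand, Matrix

x, a2, b2 = symbols('x a2 b2')
u1, u2, v1, v2 = symbols('u1 u2 v1 v2')
m = n = 2

E = expand(a2*(x - u1)*(x - u2))            # eq. (3a)
F = expand(b2*(x - v1)*(x - v2))            # eq. (3b)
a = [E.coeff(x, i) for i in range(m + 1)]   # a_0, a_1, a_2
b = [F.coeff(x, j) for j in range(n + 1)]   # b_0, b_1, b_2

S = Matrix([[a[2], a[1], a[0], 0],          # n rows of E's coefficients,
            [0,    a[2], a[1], a[0]],
            [b[2], b[1], b[0], 0],          # then m rows of F's
            [0,    b[2], b[1], b[0]]])
detS = S.det()

rhs1 = a2**n * b2**m * (u1-v1)*(u1-v2)*(u2-v1)*(u2-v2)   # eq. (1)
rhs2 = a2**n * F.subs(x, u1) * F.subs(x, u2)             # eq. (2)
print(expand(detS - rhs1) == 0)   # True
print(expand(detS - rhs2) == 0)   # True
```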
We learned in Episodes 2 and 3 that the equation
$$P E + Q F = \det(S) \tag{4}$$
always has a solution with $\deg P < n$, $\deg Q < m$, $P$ and $Q$ nonzero, coefficients in $R$. (Succinctly: $P \in R_n[x]$, $Q \in R_m[x]$. We had one proof in the singular case, another for nonsingular $S$.) Eq. (4) provides a crucial ingredient.
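To make (4) concrete, here's a small sympy experiment that finds such a $P$ and $Q$ by undetermined coefficients. The sample $E$ and $F$ are my own, not from the earlier episodes:

```python
# Solving P*E + Q*F = det(S) for P, Q by undetermined coefficients,
# with deg P < n and deg Q < m.  E and F are arbitrary samples.
from sympy import symbols, Matrix, Poly, expand, solve

x = symbols('x')
E = x**2 + 3*x + 1                 # m = 2
F = x**3 - 2*x + 5                 # n = 3

S = Matrix([[1, 3,  1,  0, 0],     # n = 3 rows of E's coefficients
            [0, 1,  3,  1, 0],
            [0, 0,  1,  3, 1],
            [1, 0, -2,  5, 0],     # m = 2 rows of F's
            [0, 1,  0, -2, 5]])
d = S.det()

p0, p1, p2, q0, q1 = symbols('p0 p1 p2 q0 q1')
P = p2*x**2 + p1*x + p0            # deg P < n
Q = q1*x + q0                      # deg Q < m

# Match coefficients of P*E + Q*F - det(S), a degree-4 polynomial in x.
sol = solve(Poly(expand(P*E + Q*F) - d, x).all_coeffs(),
            [p0, p1, p2, q0, q1])
print(d, sol)
print(expand((P*E + Q*F).subs(sol) - d) == 0)   # True: eq. (4) holds
```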
Here are the bones of the proof of eq. (1); flesh on the bones to follow. For some pair $u_i$, $v_j$, set $u_i = v_j$. If $x = u_i = v_j$, then $E = F = 0$ by the right hand sides of (3). Then (4) tells us that $\det(S) = 0$. From the factor theorem we conclude that $(u_i - v_j)$ is a factor of $\det(S)$. Since $u_i$ and $v_j$ were arbitrarily chosen, every factor $(u_i - v_j)$ of the double product divides $\det(S)$, and so does the product itself. Comparing degrees, we can show that $\det(S)$ equals the product.
Now let's dot some i's and cross some t's (switching to a less visceral metaphor). We can cast the argument in a concrete 19th century style, or take a more modern structural approach. We'll do both together. Start with a special case, where the leading coefficients $a_m$ and $b_n$ are both 1. First piece of business: treat the $u_i$'s and $v_j$'s as formal symbols. Expanding out the right hand sides of eq. (3), we get expressions for the $a_i$'s and $b_j$'s as polynomials in them, the so-called elementary symmetric polynomials (up to sign):
$$a_i \;=\; (-1)^{m-i} \sum_{1 \le j_1 < \cdots < j_i \le m} u_1 \cdots \widehat{u_{j_1}} \cdots \widehat{u_{j_i}} \cdots u_m \tag{5a}$$
where the hat means "omit this factor", and likewise for the $b_j$'s (eq. (5b), which I won't bother to write out). Plugging these $a_i$'s and $b_j$'s into $\det(S)$, we obtain a big, elaborate polynomial in the $u_i$'s and $v_j$'s with integer coefficients.
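Here's (5a) checked mechanically in sympy for $m = 3$, with the subset sums spelled out via itertools:

```python
# Checking eq. (5a) for m = 3: the coefficient a_i of the monic product
# is (-1)^(m-i) times the sum over (m-i)-element subsets of the roots.
from itertools import combinations
from sympy import symbols, expand, Mul

x = symbols('x')
u = symbols('u1:4')                       # (u1, u2, u3), so m = 3
m = len(u)

E = expand(Mul(*(x - ui for ui in u)))    # monic case of eq. (3a)

for i in range(m + 1):
    # products of all the u's with i of them omitted (the "hats")
    e = sum(Mul(*s) for s in combinations(u, m - i))
    assert expand(E.coeff(x, i) - (-1)**(m - i)*e) == 0
print("eq. (5a) checks out for m = 3")
```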
Structurally, we're working in the ring $R = k[u_1, \ldots, u_m, v_1, \ldots, v_n]$, where the $u_i$'s and $v_j$'s are variables. The $a_i$'s, the $b_j$'s, and $\det(S)$ are all elements of $R$.
What happens to this $\det(S)$ if, say, you replace $u_1$ everywhere with $v_1$? We have to rewrite eq. (5a), but (5b) doesn't change. Likewise for (3a) and (3b). Next, let's substitute $v_1$ for $x$. The right hand side of (3b) becomes identically 0, implying that $F(v_1)$, fully expanded, is identically 0. The modified right hand side of (3a) (with $v_1$ in place of $u_1$) also is identically 0, so $E(v_1)$ is identically 0. And finally, from (4) we conclude that if you replace $u_1$ with $v_1$ everywhere in the polynomial $\det(S)$, the result is identically 0.
Structurally, we pull out an old trick from our toolbox: we regard $R = k[u_1, \ldots, u_m, v_1, \ldots, v_n]$ as $k[u_2, \ldots, u_m, v_1, \ldots, v_n][u_1]$. Write $R_1$ for $k[u_2, \ldots, u_m, v_1, \ldots, v_n]$, so $R = R_1[u_1]$. Replacing $u_1$ with $v_1$ just means evaluating the polynomials of $R_1[u_1]$ at the element $v_1$ of $R_1$. Previously we've called this an evaluation homomorphism $R_1[u_1] \to R_1$. It extends canonically to a homomorphism from $R[x]$ to $R_1[x]$. Under this map, $E(x)$ goes to the polynomial $(x - v_1)(x - u_2) \cdots (x - u_m)$. We also have an evaluation homomorphism from $R_1[x]$ to $R_1$, where we set $x = v_1$. That sends both $E$ and $F$ to 0, so the image of $\det(S)$ is 0, by eq. (4). But $\det(S)$ has no $x$'s in it, so this is the same as applying the homomorphism $R_1[u_1] \to R_1$ to $\det(S)$. The upshot: the polynomial $\det(S)$ of $R = R_1[u_1]$ has the root $v_1$ in $R_1$.
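The whole chain of homomorphisms collapses to a single substitution in sympy. A concrete check, for $m = n = 2$ in the monic case:

```python
# Substituting v1 for u1 in det(S) (the evaluation R1[u1] -> R1)
# kills the determinant; m = n = 2, monic E and F.
from sympy import symbols, expand, Matrix

x, u1, u2, v1, v2 = symbols('x u1 u2 v1 v2')
E = expand((x - u1)*(x - u2))
F = expand((x - v1)*(x - v2))
a = [E.coeff(x, k) for k in range(3)]
b = [F.coeff(x, k) for k in range(3)]
S = Matrix([[a[2], a[1], a[0], 0],
            [0,    a[2], a[1], a[0]],
            [b[2], b[1], b[0], 0],
            [0,    b[2], b[1], b[0]]])
detS = expand(S.det())

print(expand(detS.subs(u1, v1)))   # 0: so u1 - v1 divides det(S)
```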
Now we can appeal to the factor theorem and conclude that $\det(S)$ is divisible by $(u_1 - v_1)$ in $R$. I should mention that the factor theorem still holds for polynomials over a ring, not just a field. The proof amounts to long division by a linear polynomial of the form $x - a$ (or $u_1 - v_1$ in our present circumstances); since the leading coefficient is 1, we never need inverses in the ring. This explicit long division serves as the concrete argument.
Because $u_1$ and $v_1$ were arbitrary, the same rigamarole shows that $\det(S)$ is divisible by $(u_i - v_j)$ for any $i$ and $j$. Indeed, $\det(S)$ is divisible by the product of all these factors; you can show this by induction, but perhaps a slicker approach is to note that $R$ is a UFD, and that all the $(u_i - v_j)$'s are irreducible and no two are associates.
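Continuing the $m = n = 2$ toy case in sympy: remainders confirm that each linear factor divides, and factoring $\det(S)$ exhibits the whole product at once:

```python
# Each (u_i - v_j) divides det(S); same m = n = 2 monic setup as above.
from sympy import symbols, expand, factor, rem, Matrix

x, u1, u2, v1, v2 = symbols('x u1 u2 v1 v2')
E = expand((x - u1)*(x - u2))
F = expand((x - v1)*(x - v2))
a = [E.coeff(x, k) for k in range(3)]
b = [F.coeff(x, k) for k in range(3)]
S = Matrix([[a[2], a[1], a[0], 0],
            [0,    a[2], a[1], a[0]],
            [b[2], b[1], b[0], 0],
            [0,    b[2], b[1], b[0]]])
detS = expand(S.det())

for lin, gen in [(u1 - v1, u1), (u1 - v2, u1),
                 (u2 - v1, u2), (u2 - v2, u2)]:
    assert rem(detS, lin, gen) == 0   # zero remainder: lin divides det(S)
print(factor(detS))   # the product of the four factors (u_i - v_j)
```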
OK, now we have $\det(S) = h \prod_{i=1}^{m} \prod_{j=1}^{n} (u_i - v_j)$, with $h \in R$. Next we compare degrees. The product consists of $mn$ factors, each of degree 1 in the variables (the $u_i$'s and $v_j$'s), so when expanded, it's homogeneous of degree $mn$. If we can show the same for $\det(S)$, then it will follow that $h$ is a constant.
As it happens, Kendig provides the argument we need on p. 66. Let's say we replace each $u_i$ and $v_j$ with $t u_i$ and $t v_j$. If we can show that $\det(S)$ turns into $t^{mn} \det(S)$, it will follow that $\det(S)$ is homogeneous of degree $mn$. Looking at the elementary symmetric polynomials (5a), we see that $a_0$ is homogeneous of degree $m$, $a_1$ is homogeneous of degree $m-1$, and in general $a_i$ is homogeneous of degree $m-i$. Likewise $b_j$ is homogeneous of degree $n-j$. The Sylvester matrix (for $m=2$ and $n=3$, with leading coefficients $a_2 = b_3 = 1$) looks like:

$$S = \begin{pmatrix} 1 & a_1 & a_0 & 0 & 0 \\ 0 & 1 & a_1 & a_0 & 0 \\ 0 & 0 & 1 & a_1 & a_0 \\ 1 & b_2 & b_1 & b_0 & 0 \\ 0 & 1 & b_2 & b_1 & b_0 \end{pmatrix}$$
So after we replace each $u_i$ and $v_j$ with $t u_i$ and $t v_j$, the entries are multiplied elementwise by this matrix (dots mark the positions where $S$ has a zero entry):

$$\begin{pmatrix} 1 & t & t^2 & \cdot & \cdot \\ \cdot & 1 & t & t^2 & \cdot \\ \cdot & \cdot & 1 & t & t^2 \\ 1 & t & t^2 & t^3 & \cdot \\ \cdot & 1 & t & t^2 & t^3 \end{pmatrix}$$
It's hard to guess what happens to the determinant as a whole from this. We improve matters by (as Kendig puts it) packing the matrix with additional powers of $t$. Multiply each row by a power of $t$ (here rows 1 through 5 by $1, t, t^2, 1, t$) to make the columns uniform, thus:

$$\begin{pmatrix} 1 & t & t^2 & \cdot & \cdot \\ \cdot & t & t^2 & t^3 & \cdot \\ \cdot & \cdot & t^2 & t^3 & t^4 \\ 1 & t & t^2 & t^3 & \cdot \\ \cdot & t & t^2 & t^3 & t^4 \end{pmatrix}$$

Now column $c$ carries the uniform factor $t^{c-1}$.
Since multiplying a column by a factor multiplies the determinant by the same factor, the packed matrix multiplies $\det(S)$ by $t^{1+2+3+4} = t^{10}$, in our special case. We packed with additional factors $t^{1+2+1} = t^4$, again for this case. Net effect: multiplying the $u_i$'s and $v_j$'s by $t$ multiplies $\det(S)$ by $t^{10}/t^4 = t^6$, as desired. Kendig does the (straightforward) algebra for general $m$ and $n$. So the determinant is homogeneous of degree $mn$ in the $u_i$'s and $v_j$'s.
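The scaling claim is also easy to test symbolically; here it is for the $m = n = 2$ toy case, where $t^{mn} = t^4$:

```python
# Scaling every root by t scales det(S) by t^(mn); m = n = 2 monic case.
from sympy import symbols, expand, Matrix

x, t, u1, u2, v1, v2 = symbols('x t u1 u2 v1 v2')
E = expand((x - u1)*(x - u2))
F = expand((x - v1)*(x - v2))
a = [E.coeff(x, k) for k in range(3)]
b = [F.coeff(x, k) for k in range(3)]
S = Matrix([[a[2], a[1], a[0], 0],
            [0,    a[2], a[1], a[0]],
            [b[2], b[1], b[0], 0],
            [0,    b[2], b[1], b[0]]])
detS = expand(S.det())

scaled = detS.subs({u1: t*u1, u2: t*u2, v1: t*v1, v2: t*v2},
                   simultaneous=True)
print(expand(scaled - t**4*detS) == 0)   # True: homogeneous of degree mn
```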
Our goal is in sight. We've shown $h$ is constant; now to show it's 1. For this, we expand the product, picking one term to focus on; then we locate the matching term in the determinant.
Say we choose the $-v_j$ in each factor $(u_i - v_j)$ of the product; that gives us one term of the expansion. Each $v_j$ appears in $m$ factors, one for each $u_i$, so our result is $(-1)^{mn} v_1^m \cdots v_n^m = ((-1)^n v_1 \cdots v_n)^m = b_0^m$. In $\det(S)$, if we go down the main diagonal we get $b_0^m$. In any other term of the "sum of products" formula for the determinant, either some $a$ row contributes an entry other than its leading 1 — a zero (killing the term) or an $a_i$ with $i < m$, which drags in $u$'s — or every $a$ row contributes its 1, which pins those rows to the diagonal columns, and chasing the zeros then forces each $b$ row to contribute its $b_0$, reproducing the diagonal term. So the main diagonal is the only term of the determinant formula contributing a term that is all $v$'s. Matching the all-$v$ terms on both sides, the constant $h$ must be 1.
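One last sympy check, again for the $m = n = 2$ toy case: setting the $u$'s to 0 extracts the all-$v$ part of $\det(S)$, which matches the diagonal product, and multivariate division gives $h = 1$ exactly:

```python
# The diagonal term b_0^m and the quotient h, for m = n = 2 monic.
from sympy import symbols, expand, div, Matrix

x, u1, u2, v1, v2 = symbols('x u1 u2 v1 v2')
E = expand((x - u1)*(x - u2))
F = expand((x - v1)*(x - v2))
a = [E.coeff(x, k) for k in range(3)]
b = [F.coeff(x, k) for k in range(3)]
S = Matrix([[a[2], a[1], a[0], 0],
            [0,    a[2], a[1], a[0]],
            [b[2], b[1], b[0], 0],
            [0,    b[2], b[1], b[0]]])
detS = expand(S.det())

diag = S[0, 0]*S[1, 1]*S[2, 2]*S[3, 3]                 # = b_0**2 here
print(expand(detS.subs({u1: 0, u2: 0}) - diag) == 0)   # True: all-v term

product = expand((u1-v1)*(u1-v2)*(u2-v1)*(u2-v2))
h, r = div(detS, product, u1, u2, v1, v2)
print(h, r)                                            # 1 0 : h really is 1
```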
What about the general case, with arbitrary leading coefficients $a_m$ and $b_n$? Throw them in as additional variables; let's just call them $a$ and $b$. The $a_i$'s and $b_j$'s acquire an additional factor; that is, the "new" $a_i$ is $a \cdot a_i$ and the "new" $b_j$ is $b \cdot b_j$. Each of the $n$ rows of $a$'s in $S$ picks up a factor $a$, and each of the $m$ rows of $b$'s a factor $b$, so the "new" $\det(S)$ is $a^n b^m$ times the "old" $\det(S)$. That's exactly the factor in front of the double product in (1). Since (1) held before (without the $a^n b^m$ on the right), it remains true after multiplying both sides by $a^n b^m$.
Closing remarks: we've actually shown that (1) is a polynomial identity in the ring $k[a, b, u_1, \ldots, u_m, v_1, \ldots, v_n]$. (We didn't even use the fact that $k$ is a field.)