The Resultant, Episode 2

By now you know the characters: the polynomials *E*(*x*) (degree *m*) and *F*(*x*) (degree *n*) with coefficients in an integral domain *R*, its fraction field *K*, and the extension field *L* of *K* in which *E* and *F* split completely:

*E*(*x*) = *a*(*x–u*_{1})···(*x–u _{m}*),

*F*(*x*) = *b*(*x–v*_{1})···(*x–v _{n}*).

And we have our resultant res* _{x}*(*E*, *F*), an element of *R*, which is 0 if and only if *E* and *F* have a common root in *L*. When *R*=*k*[*y*] and *K*=*k*(*y*), that common root would be an algebraic function of *y*—what we’ve been calling a *branch*.

We assume, by the way, that *m* and *n* are both positive.
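For concreteness, here is the root-product description of the resultant in code—a minimal sketch, assuming the formula res* _{x}*(*E*, *F*) = *a^{n}b^{m}*∏(*u _{i}–v _{j}*) over all root pairs; the function name is mine:

```python
from math import prod

def resultant_from_roots(a, us, b, vs):
    """res_x(E, F) = a^n * b^m * product of (u_i - v_j) over all root pairs."""
    m, n = len(us), len(vs)
    return a**n * b**m * prod(u - v for u in us for v in vs)

# E = (x-1)(x-2) and F = (x-2)(x-5) share the root 2, so the resultant is 0:
print(resultant_from_roots(1, [1, 2], 1, [2, 5]))   # 0
# E = (x-1)(x-2) and F = (x-3)(x-5) share no root:
print(resultant_from_roots(1, [1, 2], 1, [3, 5]))   # (-2)(-4)(-1)(-3) = 24
```

The resultant vanishes exactly when some factor *u _{i}–v _{j}* does, i.e. exactly when a root is shared.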

(In the Inside-the-Episode example, with the roles of *x* and *y* switched, we have the branches for *E*(*y*)=*y*^{2}+(*x*^{2}–1) and for *F*(*y*)=*y*^{2}+(2*x*^{2}–1). The resultant is *not* zero, because the ellipses *E* and *F* don’t share a branch. Instead it’s res* _{y}*(*E*, *F*)=*x*^{4}.)

Even when *E* and *F* have no common branch, the resultant still has news to impart, when *R*=*k*[*y*]. We can have *u _{i}*(*y*)=*v _{j}*(*y*) at particular *values* of *y* (for some *i* and *j*). So res* _{x}*(*E*, *F*), a polynomial in *y*, tells us the possible values of *y* at the intersections of the curves *E* and *F*. (Inside-the-Episode example: the intersections occur at *x*^{4}=0.)
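A quick numeric illustration of that last point, using the ellipse example (a sketch; the helper names are mine): the resultant res* _{y}*(*E*, *F*)=*x*^{4} vanishes only at *x*=0, and that is exactly where the two curves share a value of *y*.

```python
from math import sqrt

def E(x, y): return y**2 + (x**2 - 1)
def F(x, y): return y**2 + (2*x**2 - 1)
def near_zero(t): return abs(t) < 1e-9

# At x = 0 the resultant x^4 vanishes; E's root y = 1 is a root of F as well:
y = sqrt(1 - 0**2)
print(near_zero(E(0, y)), near_zero(F(0, y)))      # True True
# At x = 0.5 the resultant is 0.0625 != 0; E's root in y is not a root of F:
y = sqrt(1 - 0.5**2)
print(near_zero(E(0.5, y)), near_zero(F(0.5, y)))  # True False
```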

Let’s look at the abstract situation, and see what we can glean from that. Our theme: “As above, so below.” That is, the resultant belongs to *R*, while the common root (if any) belongs to *L*. What other assertions about *L* translate down to *R*?

First observation: *E*(*x*) and *F*(*x*) have a common *root* in *L* if and only if they have a common *factor* over *L* that’s not a constant.

Obviously a common root implies a nonconstant common factor, by the so-called factor theorem: if *r* is a root of *f*(*x*), then (*x–r*) is a factor. But the converse is also easy: since *E*(*x*) factors completely in *L*, so does a nonconstant common factor, since *L*[*x*] is a UFD (because *L* is a field). So our common factor has a root in *L*, which naturally must also be a root of *F*(*x*).

“Having a common root” doesn’t translate between *R* and *L*, or even between *K* and *L*. But “having a nonconstant common factor” might. Obviously it translates *upwards*. How about *downwards*? In other words, if *E*(*x*) and *F*(*x*) have *no* nonconstant common factor in *R*[*x*] (or *K*[*x*])—if they are coprime there—are they coprime in *L*[*x*]?

Remarkably, yes, when *R* is a UFD. (Remarkable because being *irreducible* isn’t preserved as we “go upwards”. Standard example: *x*^{2}+1 is irreducible in ℝ[*x*] but not in ℂ[*x*].) From Gauss’s lemma we get: “coprime in *R*[*x*]” implies “coprime in *K*[*x*]”. Now, *K*[*x*] is a PID, so if *E* and *F* are coprime, then *pE+qF*=1 for some polynomials *p*,*q*∈*K*[*x*]. But that equation still holds in *L*[*x*], and it implies that any common factor of *E* and *F* divides 1. So it’s a unit, and *E* and *F* are coprime in *L*[*x*].
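The PID step is completely effective: the extended Euclidean algorithm in *K*[*x*] actually produces the *p* and *q* with *pE+qF*=1. Here is a minimal sketch over *K*=ℚ (coefficient lists run from low degree to high; all helper names are mine):

```python
from fractions import Fraction

def trim(p):
    while p and p[-1] == 0:
        p.pop()
    return p

def polymul(p, q):
    r = [Fraction(0)] * ((len(p) + len(q) - 1) if p and q else 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def polysub(p, q):
    r = [Fraction(0)] * max(len(p), len(q))
    for i, a in enumerate(p): r[i] += a
    for i, b in enumerate(q): r[i] -= b
    return trim(r)

def polydivmod(p, q):
    p = list(p)
    quot = [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift, c = len(p) - len(q), p[-1] / q[-1]
        quot[shift] = c
        p = polysub(p, polymul([Fraction(0)] * shift + [c], q))
    return trim(quot), p

def bezout(E, F):
    """Extended Euclid in K[x]: returns (g, p, q) with p*E + q*F = g = gcd."""
    r0, r1, p0, p1, q0, q1 = E, F, [Fraction(1)], [], [], [Fraction(1)]
    while r1:
        quot, rem = polydivmod(r0, r1)
        r0, r1 = r1, rem
        p0, p1 = p1, polysub(p0, polymul(quot, p1))
        q0, q1 = q1, polysub(q0, polymul(quot, q1))
    return r0, p0, q0

# E = x^2 + 1 and F = x - 1 are coprime in Q[x]:
E = [Fraction(1), Fraction(0), Fraction(1)]
F = [Fraction(-1), Fraction(1)]
g, p, q = bezout(E, F)              # g is a nonzero constant, here 2
p = [c / g[0] for c in p]           # rescale so that p*E + q*F = 1 exactly
q = [c / g[0] for c in q]
one = polysub(polymul(p, E), polymul([-c for c in q], F))   # p*E + q*F
print(one)                          # [Fraction(1, 1)]
```

Since the gcd came out constant, rescaling by it gives *pE+qF*=1—and that same identity, read in *L*[*x*], is what forces coprimality upstairs.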

The expression *pE+qF* has made its entrance onto the stage! It will play a major role in the rest of “The Resultant” miniseries, and even beyond.

Observe: if *E* and *F* are *not* coprime, then *E*=*AB* and *F*=*BC* for some *A*≠0, *C*≠0, and nonconstant *B*. So *EC*=(*AB*)*C*=*A*(*BC*)=*AF*, or *EC–AF*=0. Therefore there are nonzero polynomials *p* and *q* such that

*pE+qF*=0

with deg(*p*)<deg(*F*), deg(*q*)<deg(*E*)

Conversely, if we have *pE+qF*=0 with the degree inequalities, then *pE*=*–qF*, so *E* divides *qF*. *E* doesn’t divide *q* because deg(*q*)<deg(*E*). Using unique factorization, it follows that some irreducible (nonconstant) factor of *E* divides *F*, so *E* and *F* are not coprime. A bonus: this argument works equally well in *R*[*x*], *K*[*x*], and *L*[*x*].
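The non-coprime construction is easy to watch in code—a small sketch (integer coefficient lists, low degree first; names mine) with *B*=*x*–2 as the common factor:

```python
def polymul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def polyadd(p, q):
    r = [0] * max(len(p), len(q))
    for i, a in enumerate(p): r[i] += a
    for i, b in enumerate(q): r[i] += b
    return r

# E = A*B and F = B*C with A = x-1, B = x-2, C = x-3.
A, B, C = [-1, 1], [-2, 1], [-3, 1]
E, F = polymul(A, B), polymul(B, C)
p, q = C, [-c for c in A]           # p = C, q = -A; deg p < deg F, deg q < deg E
print(polyadd(polymul(p, E), polymul(q, F)))   # [0, 0, 0, 0]: p*E + q*F = 0
```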

The upshot: being coprime and not being coprime both boil down to the existence of certain polynomials in *K*[*x*], and if they exist down in *K*[*x*], they also exist up in *L*[*x*].

The “*pE+qF*=1” argument relied on *K*[*x*] being a PID. When *R*=*k*[*y*], the integral domain *R*[*x*] (aka *k*[*x*,*y*]) isn’t a PID, just a UFD. But the equation *pE+qF*=1 in *K*[*x*] still tells us something about *R*[*x*]. Namely, *K* is the fraction field of *R*, so we can clear the denominators in the coefficients of *p* and *q*, getting *PE+QF*=*D* in *R*[*x*], with *D* in *R*.
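The same denominator-clearing works with *R*=ℤ, *K*=ℚ, which makes for a tiny illustration (names mine): for *E*=*x*^{2}+1 and *F*=*x*–1, Euclid gives (1/2)*E*–((*x*+1)/2)*F*=1 over ℚ, and multiplying through by 2 gives *PE+QF*=*D* with *P*, *Q* in ℤ[*x*] and *D*=2 in ℤ.

```python
def polymul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def polyadd(p, q):
    r = [0] * max(len(p), len(q))
    for i, a in enumerate(p): r[i] += a
    for i, b in enumerate(q): r[i] += b
    return r

E, F = [1, 0, 1], [-1, 1]     # x^2 + 1 and x - 1 (coefficients low to high)
P, Q = [1], [-1, -1]          # P = 1, Q = -(x + 1): denominators cleared
print(polyadd(polymul(P, E), polymul(Q, F)))   # [2, 0, 0]: the constant D = 2
```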

Remember the coordinate ring Γ(*E*∩*F*) of post 7? The quotient *k*[*x*,*y*]/(*E*,*F*)? Notice that the ideal (*E*,*F*) of *k*[*x*,*y*] is the set {*PE+QF* : *P,Q*∈*k*[*x*,*y*]}, and its smallest enclosing ideal in *K*[*x*] is {*pE+qF* : *p,q*∈*K*[*x*]}, where *K*=*k*(*y*).

With all this as motivation, we’re ready for the final plot twist of Episode 2. Let’s look at the homomorphism

*K*[*x*]⊕*K*[*x*] → *K*[*x*]

⟨*p,q*⟩ ↦ *pE+qF*

Call it Φ. If we regard *K*[*x*]⊕*K*[*x*] and *K*[*x*] as vector spaces over *K*, then Φ is a linear operator. This turns our thoughts to its null space, its image, stuff like that. *K*[*x*] is infinite-dimensional over *K*. Which is OK, but finite-dimensional vector spaces are easier to deal with. But now recall the degree inequalities from above: deg(*p*)<deg(*F*)=*n*, deg(*q*)<deg(*E*)=*m*. Let’s write *K _{m}*[*x*] for the space of all polynomials over *K* of degree < *m*, likewise *K _{n}*[*x*]; these have dimensions *m* and *n* as vector spaces over *K*. So we can restrict Φ to get

*K _{n}*[*x*]⊕*K _{m}*[*x*] → *K _{m+n}*[*x*]

⟨*p,q*⟩ ↦ *pE+qF*

We observed earlier that *E* and *F* are *not* coprime if and only if *pE+qF*=0 for some nonzero *p*∈*K _{n}*[*x*] and *q*∈*K _{m}*[*x*]. Our restricted Φ is singular if and only if *E* and *F* are not coprime.

It’s easy enough to write the matrix for the restricted Φ, over the bases {1,…,*x*^{n–1}; 1,…,*x*^{m–1}} and {1,…,*x*^{m+n–1}}. You might think I repeated some elements when I wrote {1,…,*x*^{n–1}; 1,…,*x*^{m–1}}. Not so. We get the basis of *K _{n}*[*x*]⊕*K _{m}*[*x*] by *concatenating* the bases of *K _{n}*[*x*] and *K _{m}*[*x*]. So (for example) those two 1’s represent different basis elements, namely ⟨1,0⟩ and ⟨0,1⟩, sent by Φ to *E* and *F* respectively.

Suppose *E*(*x*)=*a*_{0}+···+*a _{m}x^{m}*, *F*(*x*)=*b*_{0}+···+*b _{n}x^{n}*. Here’s the matrix, picking the special case *m*=2 and *n*=3, with the input basis labeling the columns and the output basis labeling the rows:

            ⟨1,0⟩  ⟨x,0⟩  ⟨x²,0⟩  ⟨0,1⟩  ⟨0,x⟩
      1   [  a₀                     b₀         ]
      x   [  a₁     a₀              b₁     b₀  ]
      x²  [  a₂     a₁     a₀       b₂     b₁  ]
      x³  [         a₂     a₁       b₃     b₂  ]
      x⁴  [                a₂              b₃  ]

Blank entries are zeros. For example, the basis element ⟨*x*,0⟩, topping the second column, is sent to *a*_{0}*x*+*a*_{1}*x*^{2}+*a*_{2}*x*^{3}; the basis element ⟨0,*x*⟩, topping the last column, is sent to *b*_{0}*x*+*b*_{1}*x*^{2}+*b*_{2}*x*^{3}+*b*_{3}*x*^{4}.

This is called the *Sylvester matrix*. (Usually transposed, and often with the rows/columns rearranged. Conventions differ. I’ll switch to the traditional form in the next post.)

The determinant of the Sylvester matrix is therefore zero, when and only when *E* and *F* have a common nonconstant factor. Hey, that’s just like the resultant! Maybe the Sylvester determinant *equals* the resultant?

Indeed it does (up to a factor of ±1, depending on conventions). We’ll see why in Episode 4.
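We can already put that to the test on the Inside-the-Episode example. Here is a sketch (my own helper names; the matrix follows the column convention used above) that builds the Sylvester matrix with entries in *k*[*x*] and expands its determinant:

```python
def pmul(p, q):
    r = [0] * ((len(p) + len(q) - 1) if p and q else 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def padd(p, q):
    r = [0] * max(len(p), len(q))
    for i, a in enumerate(p): r[i] += a
    for i, b in enumerate(q): r[i] += b
    return r

def pneg(p):
    return [-a for a in p]

def sylvester(E, F):
    """Matrix of <p,q> -> p*E + q*F on the monomial bases; E, F are
    coefficient lists (low degree first) whose entries are themselves
    polynomials in x, also given as coefficient lists."""
    m, n = len(E) - 1, len(F) - 1
    M = [[[] for _ in range(m + n)] for _ in range(m + n)]
    for col in range(n):                 # column <x^col, 0> carries x^col * E
        for i, c in enumerate(E):
            M[col + i][col] = c
    for col in range(m):                 # column <0, x^col> carries x^col * F
        for i, c in enumerate(F):
            M[col + i][n + col] = c
    return M

def det(M):
    """Cofactor expansion along the first row; entries are polynomials."""
    if len(M) == 1:
        return M[0][0]
    total = []
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        term = pmul(M[0][j], det(minor))
        total = padd(total, term if j % 2 == 0 else pneg(term))
    return total

# Ellipse example, roles of x and y switched:
# E(y) = y^2 + (x^2 - 1), F(y) = y^2 + (2x^2 - 1), coefficients in k[x].
E = [[-1, 0, 1], [], [1]]
F = [[-1, 0, 2], [], [1]]
print(det(sylvester(E, F)))   # [0, 0, 0, 0, 1], i.e. x^4 -- matching res_y(E, F)
```

The determinant comes out to *x*^{4}, agreeing with the resultant computed earlier—one data point for the equality we’ll prove in Episode 4.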