The trichotomy of cardinals says that for any πͺ and π«, exactly one of these holds: πͺ<π«, πͺ=π«, or πͺ>π«. It’s equivalent to the conjunction of these two propositions, for any two cardinals πͺ and π«:
- Antisymmetry:
- If πͺβ€π« and π«β€πͺ, then πͺ=π«.
- Comparability:
- Either πͺβ€π« or π«β€πͺ.
Cantor showed the trichotomy of ordinals via transfinite induction. I outlined the proof at the end of post 5. If we assume that every set can be well-ordered, then trichotomy for cardinals follows immediately. For Cantor, this well-ordering assumption was a βlaw of thoughtβ.
We can eliminate the notion of βcardinal numberβ from the statement of trichotomy. Write Xβ€Y if there is an injection from X to Y, and Xβ‘Y if there is a bijection. Antisymmetry says that if Xβ€Y and Yβ€X, then Xβ‘Y. Comparability says that either Xβ€Y or Yβ€X.
In 1915, Hartogs showed that trichotomy implies the well-ordering theorem and hence the axiom of choice. But antisymmetry is provable without using Choice. Once upon a time, people called this the SchrΓΆder-Bernstein theorem: in 1898 proofs appeared by SchrΓΆder and Bernstein (independently). Alas, SchrΓΆder’s proof suffered a fatal flaw. A proof by Dedekind turned up in his posthumous papers; he had also sent it to Cantor, who apparently didn’t notice it. Zermelo rediscovered Dedekind’s proof. So: SchrΓΆder-Bernstein? Cantor-Bernstein? Dedekind-Bernstein? Cantor-Dedekind-Berstein? Just Bernstein? Take your pick. I will call it the Equivalence Theorem.
SchrΓΆder’s βproofβ is elegant, at least. Say f:XβY and g:YβX are the injections. Let X0=X and Y0=Y, and inductively define Yn+1=f(Xn) and Xn+1=g(Yn). Then
X0 β X1Β β X2β¦
X0 β‘ X2 β‘ X4β¦
X1 β‘ X3 β‘ X5β¦
So βnX2n=βnX2n+1. SchrΓΆder now asserts that the cardinality of βnX2n is the same as the common cardinality of all the X2n, and likewise for βnX2n+1. So X0β‘X1β‘Y0. But his assertion is false: we can have βnX2n=βnX2n+1=β , even with X and Y infinite. For example, let X=Y=β, and f(n)=g(n)=n+1.
Dedekind’s proof uses an equivalent formulation. Since Y is equivalent to a subset of X, we might as well assume that Y actually is this subset. Then the injection f:XβY is a map from X to itself. We have XΒ β YΒ β f(X). We need to show Xβ‘Y.
Let P=XβY and Q=Yβf(X). Then
X=P β Q β f(P) β f(Q) β f(f(P)) β f(f(Q))β¦
Y=Β Β Β Β Q β f(P) β f(Q) β f(f(P)) β f(f(Q))β¦
Define h by h(x)=f(x) for xβP β f(P) β f(f(P))β¦, and h(x)=x for xβQ β f(Q) β f(f(Q))β¦. Verify that h is a bijection between X and Y. QED
An elegant proof, due to Julius KΓΆnig, often appears in textbooks. It’s longer than Dedekind’s proof; I will drag it out even more by offering some motivation.
In the figure above, sets X and Y are indicated with the injections f:XβY (blue) and g:YβX (red). We want to construct a 1β1 pairing between all of X and Y out of these two partial pairing. Each edge, red or blue, is a possible pairing. So a node like b has two possible partners: 1 or 3. However, a node like a has only one possible partner, namely 2. Likewise, 1’s only possible partner is b.
That means we have to start by pairing off nodes without preimages: nodes xβX for which gβ1(x) doesn’t exist, and nodes yβY for which fβ1(y) doesn’t exist. Let’s call this the first round of pairings.
Once we’ve paired off a2 and 1b, it then turns out that c and 3 each have only one possible choice (second round). And so on. In the figure, this is enough to complete the bijection. (Thickened edges indicate the pairings actually made.)
If you think about it, this procedure amounts to following the f and g edges backwards until you can’t go any further. Call this the backwards chain or just the chain. Suppose it starts with x=x1βX and terminates in Y, say x1,y1,x2,y2β¦,xn,yn. (So gβ1(x1)=y1, fβ1(y1)=x2, etc., and fβ1(yn) doesn’t exist.) In round 1, we must pair yn with xn=g(yn). In round 2, we must pair ynβ1 with xnβ1=g(ynβ1). Eventually we must pair y1 with x1=g(y1). In other words, we pair x1 with y1=gβ1(x1). We are forced to use gβ1 for each pair xiyi.
On the other hand, if the chain starts and ends in Xβsay, x=x0,y1,x1, β¦,yn,xn, where gβ1(xn) doesn’t existβthen we must pair xn with yn=f(xn), and then must pair xnβ1 with ynβ1=f(xnβ1), eventually pairing x1 with y1=f(x1) and x0 with f(x0).
If the chain never terminates, then we have a choice. We can pair x with f(x) or with gβ1(x). To make a definite rule, let’s always use gβ1(x).
Summarizing, we have this definition of the bijection h:
h(x) = f(x) if the chain terminates in X
h(x) = gβ1(x) if the chain doesn’t terminate
h(x) = gβ1(x) if the chain terminates in Y
We know that gβ1(x) exists in the last two cases, for if it didn’t, then the chain would terminate in X.
So we’ve partitioned X into three parts, X = X1 β X2 β X3:
X1={xβX : the chain terminates in X}
X2={xβX : the chain doesn’t terminate}
X3={xβX : the chain terminates in Y}
and f is used on X1 while gβ1 is used on X2 and X3. Now we have to show that h is a bijection.
First note that we get the chain for f(x) by prepending f(x) to the chain for x. The termination behavior is unaffected by this prepending, so f(Xi) is contained in Yi (i=1,2,3) where
Y1={yβY : the chain terminates in X}
Y2={yβY : the chain doesn’t terminate}
Y3={yβY : the chain terminates in Y}
Likewise, we get the chain for g(y) by prepending g(y) to the chain for y. So g(Yi) is contained in Xi (i=1,2,3). So both f and g βstay in their lanesβ: a blue f-edge cannot cross from Xi to Yj with iβ j, and likewise for the red g-edges. So the same is true for fβ1 and gβ1, wherever they are defined.
Now, f is injective, and gβ1 is injective on its domain (i.e., on g(Y)), and βthe streams can’t crossβ: if xβX1 and xβ²βX2βX3, then we can’t have h(x)=h(xβ²) because h(x)=f(x) is in Y1 and h(xβ²)=gβ1(xβ²) is in Y2βY3.
Note that f maps X1 onto Y1, because if fβ1(y) doesn’t exist, then y belongs to Y3. Also gβ1 maps X2βX3 onto Y2βY3: If yβY2βY3, then g(y) belongs to X2βX3 and gβ1 maps it to y. QED
There is a version of the Equivalence Theorem for linearly ordered sets, due to Lindenbaum. Note that an order-preserving map f:XβY of ordered sets is automatically injective: if xβ xβ², then either x<xβ² or xβ²<x, so either f(x)<f(xβ²) or f(xβ²)<f(x) and so f(x)β f(xβ²).
We can have order-preserving injections f:XβY and g:YβX without any order-isomorphism between X and Y. Simple example: X is the closed interval [0,1] and Y is the open interval (0,1) (in either β or β). But we have the following:
Theorem (Lindenbaum): If f:XβY is an order-preserving map onto an initial segment of Y, and g:YβX is an order-preserving map onto a final segment of X, then there is an order-isomorphism between X and Y.
Proof: We will show that the h constructed above works. We already know it’s a bijection, so we just have to show that it’s order-preserving.
Since f and g are order-preserving, we are left with this to prove: if xβX1 and xβ²βX2 β X3, then x<xβ² and f(x)<gβ1(xβ²).
Observe that all elements of f(X) precede all elements of the complement Yβf(X), since f maps onto an initial segment of Y. Likewise all elements of g(Y) follow all elements of Xβg(Y).
Now, if x belongs Xβg(Y), then the chain for x terminates in X, so xβX1. For any xβ²βX2βX3 the preimage gβ1(xβ²) must exist, since its chain doesn’t terminate in X. So xβ²βg(Y) and we have x<xβ² in this special case.
The figure above illustrates how to extend this to the general case of any xβX1. It shows x=aβX1 and xβ²=cβX3. The terminal nodes in the chain, b and 4, are circled. Following the chain two steps gives us bβXβg(Y) and dβX3. By the previous paragraph, b<d. But f and g are order-preserving, so g(f(b))<g(f(d)). That is, a<c.
What if we had cβX2 instead of X3? The argument is not affected. What if, traveling along the two chains in sync, the chain for xβ² terminates before the chain for x? That would happen, for example, if 2 in the figure was a terminal node, without a preimage fβ1(2). But then 1<2 because 1 belongs to f(X) and 2 belongs to Yβf(X). So g(1)=a precedes g(2)=c.
Clearly this reasoning holds for any xβX1 and xβ²βX2βX3.
It remains to show that h(x)=f(x) precedes h(xβ²)=gβ1(xβ²). Once again we can follow the two chains in sync until we hit a terminal node for one of them. Because f and g are order-preserving, we are reduced to proving two things:
- If xβXβg(Y) and xβ²βX2βX3, then f(x)<gβ1(xβ²).
- If yβY1 and yβ²βYβf(X), then y<yβ².
In both cases, we can βascend the ladderβ. That is, each bullet says that an f-edge has its Y-vertex above the Y-vertex of a gβ1-edge. In the figure, the f-edges slope upwards and are blue, while the gβ1-edges slope downward and are red. Applying fβ¬g to the Y-vertex and gβ¬f to the X-vertex moves us up βone rungβ. So the general case, where x is any element of X1 and xβ² is any element of X2βX3, follows from these two bullets.
So let’s prove the first bullet. The argument (a proof by contradiction) is a bit tedious. This figure will make it easier to follow:
Suppose gβ1(xβ²)<f(x). Since f maps X onto an initial segment of Y, this means that gβ1(xβ²) also belongs to f(X). Say gβ1(xβ²)=f(xβ³). Since f is order-preserving, and f(xβ³)<f(x), we have xβ³<x. Now, xβ²βX2βX3 and xβ³ belongs to the xβ²-chain, so xβ³βX2βX3 and is therefore in g(Y). But g maps Y onto a final segment of X, and xβ³<x, so x must also belong to g(Y). This contradicts the assumption that xβXβg(Y).
The second bullet is a breeze. As noted already, all elements of f(X) precede all elements of Yβf(X). We’ve also seen that all elements of Y1 belong to f(X). QED
Rosenstein (pp.22β23) provides another proof, patterned after Dedekind’s proof of the Equivalence Theorem.
In recursive function theory, the Myhill Isomorphism Theorem is analogous to the Equivalence Theorem.
