Set Theory Jottings 7. The (Cantor-Dedekind-SchrΓΆder)-Bernstein Theorem

Prev TOC Next

The trichotomy of cardinals says that for any π”ͺ and 𝔫, exactly one of these holds: π”ͺ<𝔫, π”ͺ=𝔫, or π”ͺ>𝔫. It’s equivalent to the conjunction of these two propositions, for any two cardinals π”ͺ and 𝔫:

Antisymmetry:
If π”ͺ≀𝔫 and 𝔫≀π”ͺ, then π”ͺ=𝔫.
Comparability:
Either π”ͺ≀𝔫 or 𝔫≀π”ͺ.

Cantor showed the trichotomy of ordinals via transfinite induction. I outlined the proof at the end of post 5. If we assume that every set can be well-ordered, then trichotomy for cardinals follows immediately. For Cantor, this well-ordering assumption was a β€œlaw of thought”.

We can eliminate the notion of β€œcardinal number” from the statement of trichotomy. Write X≀Y if there is an injection from X to Y, and X≑Y if there is a bijection. Antisymmetry says that if X≀Y and Y≀X, then X≑Y. Comparability says that either X≀Y or Y≀X.

In 1915, Hartogs showed that trichotomy implies the well-ordering theorem and hence the axiom of choice. But antisymmetry is provable without using Choice. Once upon a time, people called this the SchrΓΆder-Bernstein theorem: in 1898 proofs appeared by SchrΓΆder and Bernstein (independently). Alas, SchrΓΆder’s proof suffered a fatal flaw. A proof by Dedekind turned up in his posthumous papers; he had also sent it to Cantor, who apparently didn’t notice it. Zermelo rediscovered Dedekind’s proof. So: SchrΓΆder-Bernstein? Cantor-Bernstein? Dedekind-Bernstein? Cantor-Dedekind-Berstein? Just Bernstein? Take your pick. I will call it the Equivalence Theorem.

SchrΓΆder’s β€œproof” is elegant, at least. Say f:Xβ†’Y and g:Yβ†’X are the injections. Let X0=X and Y0=Y, and inductively define Yn+1=f(Xn) and Xn+1=g(Yn). Then

X0 βŠ‡ X1Β βŠ‡ X2…
X0 ≑ X2 ≑ X4…
X1 ≑ X3 ≑ X5…

So β‹‚nX2n=β‹‚nX2n+1. SchrΓΆder now asserts that the cardinality of β‹‚nX2n is the same as the common cardinality of all the X2n, and likewise for β‹‚nX2n+1. So X0≑X1≑Y0. But his assertion is false: we can have β‹‚nX2n=β‹‚nX2n+1=βˆ…, even with X and Y infinite. For example, let X=Y=β„•, and f(n)=g(n)=n+1.

Dedekind’s proof uses an equivalent formulation. Since Y is equivalent to a subset of X, we might as well assume that Y actually is this subset. Then the injection f:Xβ†’Y is a map from X to itself. We have XΒ βŠ‡ YΒ βŠ‡ f(X). We need to show X≑Y.

Let P=Xβˆ–Y and Q=Yβˆ–f(X). Then

X=P βŠ” Q βŠ” f(P) βŠ” f(Q) βŠ” f(f(P)) βŠ” f(f(Q))…
Y=Β  Β  Β  Β  Q βŠ” f(P) βŠ” f(Q) βŠ” f(f(P)) βŠ” f(f(Q))…

Define h by h(x)=f(x) for x∈P βŠ” f(P) βŠ” f(f(P))…, and h(x)=x for x∈Q βŠ” f(Q) βŠ” f(f(Q))…. Verify that h is a bijection between X and Y. QED

An elegant proof, due to Julius KΓΆnig, often appears in textbooks. It’s longer than Dedekind’s proof; I will drag it out even more by offering some motivation.

In the figure above, sets X and Y are indicated with the injections f:Xβ†’Y (blue) and g:Yβ†’X (red). We want to construct a 1–1 pairing between all of X and Y out of these two partial pairing. Each edge, red or blue, is a possible pairing. So a node like b has two possible partners: 1 or 3. However, a node like a has only one possible partner, namely 2. Likewise, 1’s only possible partner is b.

That means we have to start by pairing off nodes without preimages: nodes x∈X for which gβˆ’1(x) doesn’t exist, and nodes y∈Y for which fβˆ’1(y) doesn’t exist. Let’s call this the first round of pairings.

Once we’ve paired off a2 and 1b, it then turns out that c and 3 each have only one possible choice (second round). And so on. In the figure, this is enough to complete the bijection. (Thickened edges indicate the pairings actually made.)

If you think about it, this procedure amounts to following the f and g edges backwards until you can’t go any further. Call this the backwards chain or just the chain. Suppose it starts with x=x1∈X and terminates in Y, say x1,y1,x2,y2…,xn,yn. (So gβˆ’1(x1)=y1, fβˆ’1(y1)=x2, etc., and fβˆ’1(yn) doesn’t exist.) In round 1, we must pair yn with xn=g(yn). In round 2, we must pair ynβˆ’1 with xnβˆ’1=g(ynβˆ’1). Eventually we must pair y1 with x1=g(y1). In other words, we pair x1 with y1=gβˆ’1(x1). We are forced to use gβˆ’1 for each pair xiyi.

On the other hand, if the chain starts and ends in Xβ€”say, x=x0,y1,x1, …,yn,xn, where gβˆ’1(xn) doesn’t existβ€”then we must pair xn with yn=f(xn), and then must pair xnβˆ’1 with ynβˆ’1=f(xnβˆ’1), eventually pairing x1 with y1=f(x1) and x0 with f(x0).

If the chain never terminates, then we have a choice. We can pair x with f(x) or with gβˆ’1(x). To make a definite rule, let’s always use gβˆ’1(x).

Summarizing, we have this definition of the bijection h:

h(x) = f(x) if the chain terminates in X
h(x) = gβˆ’1(x) if the chain doesn’t terminate
h(x) = gβˆ’1(x) if the chain terminates in Y

We know that gβˆ’1(x) exists in the last two cases, for if it didn’t, then the chain would terminate in X.

So we’ve partitioned X into three parts, X = X1 βŠ” X2 βŠ” X3:

X1={x∈X : the chain terminates in X}
X2={x∈X : the chain doesn’t terminate}
X3={x∈X : the chain terminates in Y}

and f is used on X1 while gβˆ’1 is used on X2 and X3. Now we have to show that h is a bijection.

First note that we get the chain for f(x) by prepending f(x) to the chain for x. The termination behavior is unaffected by this prepending, so f(Xi) is contained in Yi (i=1,2,3) where

Y1={y∈Y : the chain terminates in X}
Y2={y∈Y : the chain doesn’t terminate}
Y3={y∈Y : the chain terminates in Y}

Likewise, we get the chain for g(y) by prepending g(y) to the chain for y. So g(Yi) is contained in Xi (i=1,2,3). So both f and g β€œstay in their lanes”: a blue f-edge cannot cross from Xi to Yj with iβ‰ j, and likewise for the red g-edges. So the same is true for fβˆ’1 and gβˆ’1, wherever they are defined.

Now, f is injective, and gβˆ’1 is injective on its domain (i.e., on g(Y)), and β€œthe streams can’t cross”: if x∈X1 and xβ€²βˆˆX2βŠ”X3, then we can’t have h(x)=h(xβ€²) because h(x)=f(x) is in Y1 and h(xβ€²)=gβˆ’1(xβ€²) is in Y2βŠ”Y3.

Note that f maps X1 onto Y1, because if fβˆ’1(y) doesn’t exist, then y belongs to Y3. Also gβˆ’1 maps X2βŠ”X3 onto Y2βŠ”Y3: If y∈Y2βŠ”Y3, then g(y) belongs to X2βŠ”X3 and gβˆ’1 maps it to y. QED

There is a version of the Equivalence Theorem for linearly ordered sets, due to Lindenbaum. Note that an order-preserving map f:X→Y of ordered sets is automatically injective: if x≠x′, then either x<x′ or x′<x, so either f(x)<f(x′) or f(x′)<f(x) and so f(x)≠f(x′).

We can have order-preserving injections f:Xβ†’Y and g:Yβ†’X without any order-isomorphism between X and Y. Simple example: X is the closed interval [0,1] and Y is the open interval (0,1) (in either β„š or ℝ). But we have the following:

Theorem (Lindenbaum): If f:X→Y is an order-preserving map onto an initial segment of Y, and g:Y→X is an order-preserving map onto a final segment of X, then there is an order-isomorphism between X and Y.

Proof: We will show that the h constructed above works. We already know it’s a bijection, so we just have to show that it’s order-preserving.

Since f and g are order-preserving, we are left with this to prove: if x∈X1 and xβ€²βˆˆX2 βŠ” X3, then x<xβ€² and f(x)<gβˆ’1(xβ€²).

Observe that all elements of f(X) precede all elements of the complement Yβˆ–f(X), since f maps onto an initial segment of Y. Likewise all elements of g(Y) follow all elements of Xβˆ–g(Y).

Now, if x belongs Xβˆ–g(Y), then the chain for x terminates in X, so x∈X1. For any xβ€²βˆˆX2βŠ”X3 the preimage gβˆ’1(xβ€²) must exist, since its chain doesn’t terminate in X. So xβ€²βˆˆg(Y) and we have x<xβ€² in this special case.The figure above illustrates how to extend this to the general case of any x∈X1. It shows x=a∈X1 and xβ€²=c∈X3. The terminal nodes in the chain, b and 4, are circled. Following the chain two steps gives us b∈Xβˆ–g(Y) and d∈X3. By the previous paragraph, b<d. But f and g are order-preserving, so g(f(b))<g(f(d)). That is, a<c.

What if we had c∈X2 instead of X3? The argument is not affected. What if, traveling along the two chains in sync, the chain for xβ€² terminates before the chain for x? That would happen, for example, if 2 in the figure was a terminal node, without a preimage fβˆ’1(2). But then 1<2 because 1 belongs to f(X) and 2 belongs to Yβˆ–f(X). So g(1)=a precedes g(2)=c.

Clearly this reasoning holds for any x∈X1 and xβ€²βˆˆX2βŠ”X3.

It remains to show that h(x)=f(x) precedes h(xβ€²)=gβˆ’1(xβ€²). Once again we can follow the two chains in sync until we hit a terminal node for one of them. Because f and g are order-preserving, we are reduced to proving two things:

  • If x∈Xβˆ–g(Y) and xβ€²βˆˆX2βŠ”X3, then f(x)<gβˆ’1(xβ€²).
  • If y∈Y1 and yβ€²βˆˆYβˆ–f(X), then y<yβ€².

In both cases, we can β€œascend the ladder”. That is, each bullet says that an f-edge has its Y-vertex above the Y-vertex of a gβˆ’1-edge. In the figure, the f-edges slope upwards and are blue, while the gβˆ’1-edges slope downward and are red. Applying f⚬g to the Y-vertex and g⚬f to the X-vertex moves us up β€œone rung”. So the general case, where x is any element of X1 and xβ€² is any element of X2βŠ”X3, follows from these two bullets.

So let’s prove the first bullet. The argument (a proof by contradiction) is a bit tedious. This figure will make it easier to follow:

Suppose gβˆ’1(xβ€²)<f(x). Since f maps X onto an initial segment of Y, this means that gβˆ’1(xβ€²) also belongs to f(X). Say gβˆ’1(xβ€²)=f(xβ€³). Since f is order-preserving, and f(xβ€³)<f(x), we have xβ€³<x. Now, xβ€²βˆˆX2βŠ”X3 and xβ€³ belongs to the xβ€²-chain, so xβ€³βˆˆX2βŠ”X3 and is therefore in g(Y). But g maps Y onto a final segment of X, and xβ€³<x, so x must also belong to g(Y). This contradicts the assumption that x∈Xβˆ–g(Y).

The second bullet is a breeze. As noted already, all elements of f(X) precede all elements of Yβˆ–f(X). We’ve also seen that all elements of Y1 belong to f(X). QED

Rosenstein (pp.22–23) provides another proof, patterned after Dedekind’s proof of the Equivalence Theorem.

In recursive function theory, the Myhill Isomorphism Theorem is analogous to the Equivalence Theorem.

Prev TOC Next

Leave a comment

Filed under History, Set Theory

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.