Set Theory Jottings 11. Zermelo to the Rescue! (Part 2)

In 1908 Zermelo published his paper “Investigations in the foundations of set theory”. This contained the axiom system that eventually led to ZFC. Zermelo opens the paper with this rationale:

Set theory is that branch of mathematics whose task is to investigate mathematically the fundamental notions “number”, “order”, and “function” … At present, however, the very existence of this discipline seems to be threatened by certain contradictions, or “antinomies” [such as the Russell paradox]. …it no longer seems admissible today to assign to an arbitrary logically definable notion a “set”, or “class”, as its “extension”. Cantor’s original definition of a “set” as “a collection, gathered into a whole, of certain well-distinguished objects of our perception or our thought” therefore certainly requires some restriction … Under these circumstances there is at this point nothing left for us to do but to proceed in the opposite direction and, starting from “set theory” as it is historically given, to seek out the principles required for establishing the foundations of this mathematical discipline. In solving the problem we must, on the one hand, restrict these principles sufficiently to exclude all contradictions and, on the other, take them sufficiently wide to retain all that is valuable in this theory.[1]

Zermelo’s motivation is pragmatic, unlike the philosophical approach of Russell and Whitehead’s Principia Mathematica.

Axiomatization was “in the air” at this time, with people throwing out various suggestions. Moore (p.151) offers some examples. Julius König proposed two axioms: (1) There are mental processes satisfying the formal laws of logic. (2) The continuum ℝ, treated as the totality of all sequences of natural numbers, does not lead to a contradiction. Schoenflies opted for the Trichotomy of Cardinals, and wanted to hold on to the Principle of Comprehension. Cantor sent a letter to Hilbert with some principles that look a lot like Zermelo’s axioms, but this letter didn’t come to light until decades later.

Zermelo’s article stands out as the first published proposal with a full set of axioms, demonstrating that it could save some of Cantor’s Paradise, and recognizing that the Principle of Comprehension was kaput.

Zermelo eschews philosophy:

The further, more philosophical, question about the origin of these principles and the extent to which they are valid will not be discussed here. I have not yet even been able to prove rigorously that my axioms are “consistent”, though this is certainly very essential…

The paper, he says, develops the theory of equivalence in a manner “that avoids the formal use of cardinal numbers.” He promises a second part, dealing with well-ordering, but this never appeared.

After the introduction, Zermelo begins:

  1. Set theory is concerned with a “domain” 𝔅 of individuals, which we shall call simply “objects” and among which are the “sets”. …
  2. Certain “fundamental relations” of the form aεb obtain between the objects of the domain 𝔅. …An object b may be called a set if and—with a single exception (Axiom II)—only if it contains another object, a, as an element. [I will use the modern ∈ in place of Zermelo’s ε from now on.]
  3. [Definition of subset and disjoint]
  4. A question or assertion 𝔈 is said to be “definite” if the fundamental relations of the domain, by means of the axioms and the universally valid laws of logic, determine without arbitrariness whether it holds or not. Likewise a “propositional function” 𝔈(x), in which the variable term x ranges over all individuals of a class 𝔎, is said to be “definite” if it is definite for each single individual x of the class 𝔎. Thus the question whether a ∈ b or not is always definite, as is the question whether M ⊆ N or not.

Note that item (1) allows for so-called urelements or atoms—things like integers. ZFC is a so-called “pure” set theory, without atoms.

Next come seven axioms, interlarded with extensive discussion.

Extensionality:
“Every set is determined by its elements.” In other words, if M ⊆ N and N ⊆ M, then M = N.
Elementary Sets:
The null set exists. Given any elements a and b of the domain, the sets {a} and {a,b} exist.
Separation:
To quote Zermelo: “Whenever the propositional function 𝔈(x) is definite for all elements of a set M, M possesses a subset M𝔈 containing as elements precisely those elements x of M for which 𝔈(x) is true.”

Zermelo notes that “sets may never be independently defined by means of this axiom but must always be separated as subsets from sets already given”, and that this prevents the Russell paradox and the like. Indeed, the Russell paradox is turned into a theorem: for any set M there is a subset M₀ such that M₀ ∉ M. He also notes that “definiteness” precludes some semantic paradoxes, e.g., Richard’s paradox (see post 5).
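A sketch of the standard argument behind that theorem may help here. The choice of M₀ as the set of non-self-membered elements of M is the usual one; the layout below is my own paraphrase rather than a quotation from the paper.

```latex
Let $M_0 = \{\, x \in M : x \notin x \,\}$, which exists by Separation
(the criterion $x \notin x$ is definite).

For every $x \in M$: \quad $x \in M_0 \iff x \notin x$.

If $M_0 \in M$, we may take $x = M_0$: \quad
$M_0 \in M_0 \iff M_0 \notin M_0$, a contradiction.

Hence $M_0 \notin M$.
```

Since M was arbitrary, no set contains all of its own subsets as elements, so there is no “set of all sets” in the domain.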

Zermelo shows that Separation implies the existence of set differences M∖M₁ (denoted M−M₁) and intersections M∩N (denoted [M,N]), and even ⋂_{X∈T} X for a set of sets (which he denotes 𝔇T, for “Durchschnitt”).

Power Set:
For any set T, there is a set whose elements are precisely all of T’s subsets. He denotes the power set of T by 𝔘T (for “Untermengen”).
Union:
For any set T, there is a set whose elements are precisely the elements of the elements of T. In modern notation, ⋃_{X∈T} X. Denoted 𝔖T, for “Summe”. He writes M+N for our M∪N.
Choice:
Given any set T of mutually disjoint nonempty sets, the union ⋃_{X∈T} X contains a subset S such that S∩X is a singleton for each X ∈ T.

Zermelo adds, “We can also express this axiom by saying that it is always possible to choose a single element from each element M,N,R,… of T and to combine all the chosen elements, m,n,r,…, into a set” S.

The set of all S’s satisfying this condition (card(S∩X) = 1 for all X ∈ T) Zermelo calls the product of the elements of T, denoted 𝔓T, or just MN for a pair of disjoint sets M and N.
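For finite families the product can be computed by brute force. The following is only an illustrative sketch: the helper names `union`, `powerset`, and `product` are mine, and Python frozensets stand in for Zermelo's sets.

```python
from itertools import combinations

def union(T):
    """Union of a family T: the elements of the elements of T."""
    return frozenset(x for X in T for x in X)

def powerset(M):
    """All subsets of M, as a list of frozensets."""
    items = list(M)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def product(T):
    """Every subset S of the union of T that meets each X in T
    in exactly one element (a hypothetical stand-in for 𝔓T)."""
    return {S for S in powerset(union(T))
            if all(len(S & X) == 1 for X in T)}

M = frozenset({1, 2})
N = frozenset({"a", "b", "c"})
T = frozenset({M, N})          # a family of mutually disjoint nonempty sets
P = product(T)                 # the six unordered pairs {m, n}
```

For a finite family the product is visibly nonempty. The content of the Choice axiom is that the same holds for every family of mutually disjoint nonempty sets.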

Infinity:
There is a set Z containing the null set 0, and for each of its elements a, it also contains {a}.

This leads to the so-called “Zermelo finite ordinals”, 0, {0}, {{0}}, {{{0}}}, etc. Z contains all these, and using Separation, we can assume Z contains exactly these. The Zermelo finite ordinals have two drawbacks: (1) They don’t extend naturally into the infinite ordinals; (2) Each of them, except 0, contains exactly one element. The von Neumann ordinals removed both of these blemishes.
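The contrast between the two kinds of numerals is easy to see on small cases. This is a minimal sketch using Python frozensets; the function names `zermelo` and `von_neumann` are chosen here for illustration.

```python
def zermelo(n):
    """Zermelo numeral: 0 is the empty set, and n+1 is {n}."""
    s = frozenset()
    for _ in range(n):
        s = frozenset({s})
    return s

def von_neumann(n):
    """Von Neumann numeral: 0 is the empty set, and n+1 is n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset({s})
    return s

# Number of elements of the numerals 0..4 under each coding.
zermelo_sizes = [len(zermelo(n)) for n in range(5)]
von_neumann_sizes = [len(von_neumann(n)) for n in range(5)]
```

Under the von Neumann coding, the numeral n has n elements and "smaller than" is simply membership, which is what lets the construction continue into the infinite ordinals.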

The rest of the paper develops the theory of equivalence from the axioms. I noted that Zermelo allows atoms. On the other hand, he does not have ordered pairs, and thus neither relations nor functions. This lack calls for some gymnastics. When M and N are disjoint, the set of all unordered pairs MN = {{m,n} : m ∈ M, n ∈ N} substitutes for our M×N.

To define equivalence between sets M and N, he assumes first that M and N are disjoint. Using MN instead of M×N, he can define “bijection between M and N”. If one exists, then M and N are “immediately equivalent”. Dropping the disjointness condition, he says M and N are “mediately equivalent” if there exists a third set that is disjoint from both and “immediately equivalent” to both. It takes a couple of pages to show that this definition makes sense.
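Here is one way to picture a bijection without ordered pairs, on finite sets. It is a sketch under an assumption: I take a bijection to be a subset of MN in which every element of M∪N lies in exactly one pair, which is how I read the definition. The names `pair_product` and `is_bijection` are mine.

```python
def pair_product(M, N):
    """Unordered-pair product MN of two disjoint sets M and N."""
    if M & N:
        raise ValueError("M and N must be disjoint")
    return frozenset(frozenset({m, n}) for m in M for n in N)

def is_bijection(F, M, N):
    """True if F is a subset of MN and every element of M ∪ N
    lies in exactly one pair of F."""
    if not F <= pair_product(M, N):
        return False
    return all(sum(1 for p in F if x in p) == 1 for x in M | N)

M = frozenset({1, 2})
N = frozenset({2, 3})          # overlaps M, so MN is not available
R = frozenset({"a", "b"})      # a third set, disjoint from both

F = frozenset({frozenset({1, "a"}), frozenset({2, "b"})})  # pairs M with R
G = frozenset({frozenset({2, "a"}), frozenset({3, "b"})})  # pairs N with R
H = frozenset({frozenset({1, "a"}), frozenset({2, "a"})})  # "a" twice, "b" never
```

Here M and N are not disjoint, so they can only be “mediately equivalent”: `is_bijection(F, M, R)` and `is_bijection(G, N, R)` both hold, with R as the go-between, while H fails the test.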

Zermelo proves the Equivalence Theorem, that is the “(Cantor-Dedekind-Schröder)-Bernstein Theorem”. (A couple of decades later, he discovered that Dedekind had basically the same proof.) He gives detailed proofs of the basic facts about equivalence. He defines “M has lower cardinality than N” in the usual fashion (M injects into N but not vice versa) but avoids defining “cardinal number”, as he promised in the introduction. The paper crescendoes in a proof of J. König’s inequality, a generalization of Cantor’s 𝔪 < 2^𝔪. Expressed using cardinal numbers, this says that if 𝔪_k < 𝔫_k for all k in some index set K, then ∑_k 𝔪_k < ∏_k 𝔫_k. Zermelo, of course, phrases this without mentioning cardinal numbers.
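To see why König’s inequality generalizes Cantor’s theorem, it is enough to specialize the index set. The worked instance below uses modern cardinal notation, not Zermelo’s phrasing, and the choice of constant families is the standard one.

```latex
Let $K$ be an index set with $|K| = \mathfrak{m}$, and put
$\mathfrak{m}_k = 1$ and $\mathfrak{n}_k = 2$ for every $k \in K$.

Then $\mathfrak{m}_k < \mathfrak{n}_k$ for all $k \in K$, so König's inequality gives
\[
  \mathfrak{m} \;=\; \sum_{k \in K} 1
  \;=\; \sum_{k \in K} \mathfrak{m}_k
  \;<\; \prod_{k \in K} \mathfrak{n}_k
  \;=\; \prod_{k \in K} 2
  \;=\; 2^{\mathfrak{m}} .
\]
```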

Zermelo spills a fair amount of ink on the question of “definiteness”. He initially claims that a ∈ b and M ⊆ N are definite questions, as we’ve seen. When defining 𝔇T, he notes that for any object a, the set T_a = {X ∈ T : a ∈ X} exists by Separation (because a ∈ X is definite). But the question whether T_a = T is also definite. So using Separation again, 𝔇T = {a ∈ A : T_a = T}, where A is any element of T. A similar discussion accompanies his definition of “immediately equivalent”, showing that it is definite whether a given subset of MN defines a bijection.
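The two-step use of Separation for 𝔇T can be mimicked on finite sets. This is only a sketch: `separate` and `intersection` are hypothetical helper names, and an arbitrary Python predicate stands in for a “definite” criterion.

```python
def separate(M, pred):
    """Separation: the subset of M whose elements satisfy pred."""
    return frozenset(x for x in M if pred(x))

def intersection(T):
    """𝔇T for a nonempty family T, via two uses of Separation."""
    A = next(iter(T))                          # any element of T

    def T_a(a):
        # First use: the members of T that contain a.
        return separate(T, lambda X: a in X)

    # Second use: the elements a of A lying in every member of T.
    return separate(A, lambda a: T_a(a) == T)

T = frozenset({
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({3, 5}),
})
D = intersection(T)                            # frozenset({3})
```

The result does not depend on which A is picked, since anything lying in every member of T lies in A in particular.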

Nonetheless, a certain nimbus of indefiniteness surrounds Zermelo’s “definite”. Twenty-one years later, Zermelo published a paper, “On the concept of definiteness in axiomatics”. By this time, people had suggested replacing “definite” with “definable in first-order logic”. Zermelo did not accept this, and his proposal had no influence on ZFC.

[1] Despite this clear statement from Zermelo, Moore argues that “his axiomatization was primarily motivated by a desire to secure his demonstration of the Well-Ordering Theorem and, in particular, to save his Axiom of Choice” (Moore (p.159)). He notes that Zermelo composed the two 1908 papers—the axiomatization, and the second well-ordering proof—together, and that “there are numerous internal links connecting the two papers” (Moore). Zermelo’s biographer takes an intermediate view: “Above all, however, one has to take into consideration how deeply Zermelo’s axiomatic work was entwined with Hilbert’s programme of axiomatization and the influence of the programme’s ‘philosophical turn’ which was triggered by the publication of the paradoxes in 1903” (Ebbinghaus (p.81)).

Filed under History, Set Theory

8 responses to “Set Theory Jottings 11. Zermelo to the Rescue! (Part 2)”

  1. Toby Bartels

    This history is interesting.

    I was aware that Zermelo’s version of the Axiom of Choice required the sets to be disjoint, perhaps making the axiom more plausible without actually weakening it. And I was aware that the Axiom of Choice is equivalent to the statement that a cartesian product of inhabited sets is inhabited. But I did not know that Zermelo viewed his axiom as stating that a product of inhabited (or equivalently non-empty) sets is inhabited, with the strange restriction that one can only take a product of disjoint sets.

    But it’s neatly parallel to the restriction that one can only take a sum (or so-called disjoint union) of sets as the ordinary union if they are disjoint. In both cases, we can remove this restriction, at the cost of complicating the definitions of the operations, using ordered pairs.

    • Yes, I think the lack of ordered pairs explains most of the odd features of Zermelo’s system.

      It’s curious that he didn’t incorporate them. His system is not a pure set theory—he allows urelements.

      • Toby Bartels

        But then he would have had to hypothesize the existence of ordered pairs, which would be an ontological complication (although Bourbaki did it). According to https://blog.plover.com/math/wiener-pairs.html, the first construction of ordered pairs within pure set theory was in 1914 (by Norbert Wiener), so he almost could have done it, or maybe thought of it himself if he’d found it necessary. But he didn’t.

  2. I’m trying to understand this excerpt: “a “propositional function” 𝔈(x), in which the variable term x ranges over all individuals of a class 𝔎, is said to be “definite” if it is definite for each single individual x of the class 𝔎.”

    How would we think about this enough to form an opinion about it, for a large enough infinite set, given that most of its elements can’t be uniquely described?

    Would we just say something like “suppose x is in 𝔎, then I can see how to evaluate 𝔈(x) in principle using basic facts about x”? Which facts? Its membership relations with … everything else? This seems pretty vague, unless we already have some concept of successive construction of every object in successive layers, which it sounds pretty clear was not part of this system.

    • Item 4 in the post is a direct quote from Zermelo’s paper (as are items 1–3). The other main passage in the paper about “definite” comes right after Axiom III (Separation):

      By giving us a large measure of freedom in defining new sets, Axiom III in a sense furnishes a substitute for the general definition of set that was cited in the introduction and rejected as untenable. It differs from that definition in that it contains the following restrictions. In the first place, sets may never be independently defined by means of this axiom but must always be separated as subsets from sets already given; thus contradictory notions such as “the set of all sets” or “the set of all ordinal numbers”, and with them the “ultrafinite paradoxes”, to use Mr. G. Hessenberg’s expression (Grundbegriffe der Mengenlehre, chap. 24), are excluded. In the second place, moreover, the defining criterion must always be “definite” in the sense of our definition in No. 4 (that is, for each single element x of M the “fundamental relations of the domain” must determine whether it holds or not), with the result that, from our point of view, all criteria such as “definable by means of a finite number of words”, hence the “Richard antinomy” and the “paradox of finite denotation” (Hessenberg op. cit., chap. 23; on the other hand, see J. König, Math. Ann. vol.61, p.156) vanish. But it also follows that we must, prior to each application of our Axiom III, prove the criterion 𝔈(x) in question to be “definite”, if we wish to be rigorous; in the considerations developed below this will indeed be proved whenever it is not altogether evident.

      As for the promised proofs of definiteness, these occur scattered throughout the paper. I summarized the one concerning 𝔇T. Here’s another, quoting directly:

      Introduction of the product. If M is a set different from 0 and a is any one of its elements, then according to No. 5 it is definite whether M = {a} or not. It is therefore always definite whether a given set consists of a single element or not.

      Zermelo’s notion of “definite” was clearly meant as a defence against Richard’s paradox and the like. In solving one problem, he introduced another. Contemporaries found this as vague and unsatisfactory as you do. Skolem’s proposal, to use “definable in the first order predicate calculus” instead, carried the day. But Zermelo never accepted it, partly because of Skolem’s paradox: the existence of a countable model of the axioms.

      • Thanks for those extra quotes — they make his intent at least a little clearer to me. I didn’t mean to imply that I “find this concept unsatisfactory”, only that I’m trying to understand what he really meant by it. I am so used to Skolem’s proposal that it may be impossible to imagine myself in the place of someone for whom that was not standard.

        Did Zermelo hold out some hope for preventing the existence of a countable model (say for some set-theory-based axioms of the reals, or of set theory itself)? Did he perhaps not buy into avoiding second-order logic? (I don’t know whether those questions might be related.)

      • Toby Bartels

        I’ve often heard it said that Zermelo’s notion of definite property was second-order, although that seems anachronistic. I wonder if Zermelo gave an example of a property that the first-order interpretation doesn’t allow but that he thought should be.

  3. Bruce and Toby—

    Your comments have stimulated me to write a short post about Zermelo’s “definiteness” paper. Coming soon…
