## Review: Smullyan & Fitting, Set Theory and the Continuum Problem

The first sentence of Pollard’s review sums up my feelings perfectly: “This rewarding, exasperating book…” On balance, I found it more exasperating than rewarding. But it does have its charms.

I participated in a meetup group that went through the first two parts of S&F. My fellow participants possessed considerable mathematical knowledge and sophistication, but had only slight prior acquaintance with mathematical logic and none with axiomatic set theory. (The opinions here are strictly my own, but they reflect my experience in the meetup.) If I had just skimmed the book, glancing at familiar material, I would probably have a more positive impression.

I wrote an extensive set of notes for the meetup. This post is basically the last section of those notes.

I will begin with the book’s minuses, so as to end on a positive note.

Sloppiness

A major drawback for self-study. Some of the errors are clearly just typos, but many are not so easily fixed. We frequently found mistakes not listed in any of the online lists of errata (themselves not short for a book of this size), or mentioned in Pollard’s review.

The sloppiness goes beyond specific errors. As Pollard observes, the authors don’t want to take ‘=’ as a primitive relation, but they never define it; each of the two standard definitions is inconsistent with the text at one point or another. Another example: their treatment of classes. Some authors opt for ZF, others for NBG, but S&F occupy an uneasy middle ground. It often seems that they couldn’t make up their minds. Even minor matters (like the haphazard numbering of the axioms) betray a pervasive inattention to detail.

Logic Background

S&F fall prey to the seductive notion that you can understand one specific first-order theory, ZF (or NBG), without first knowing what the phrase “a first-order theory” means. They say in the preface:

Our book is intended as a text for advanced undergraduates and graduates in mathematics and philosophy. It is close to self-contained, involving only a basic familiarity with the logical connectives and quantifiers. A semester course in  symbolic logic is more than enough background.

At best, this may be technically true. In practice, to appreciate the issues you need familiarity with quite a few notions from first-order logic and model theory.

S&F put off a serious discussion of first-order syntax and semantics until Ch.11 (p.141–144), in part II. Part I takes place in the land of naive set theory, with confusing forays into axiomatic territory.

This fuzziness regarding syntax and semantics spills over onto content; it’s not just style. “Axioms” A1 and A2 (p.17) prove especially troublesome. My notes explain why, and give my best guess at their intended interpretation; I relied on uses occurring 85 pages later, and a concept that first appears (implicitly) over 130 pages later, in Ch.12. But a reader shouldn’t have to engage in such detective work. All this fogginess wafts up from the authors’ refusal to be explicit (until much later) about the syntax and semantics of first-order logic.

Sean Carroll, in his preface to Spacetime and Geometry: An Introduction to General Relativity, writes:

An intentional effort has been made to prefer the conventional over the idiosyncratic… Since the goal of the book is pedagogy rather than originality, I have often leaned heavily on other books (listed in the bibliography) when their expositions seemed perfectly sensible to me.

Would that S&F had followed this philosophy! They sometimes seem allergic to convention.

Their choice of terms manifests this in a minor way. The downward Löwenheim-Skolem Theorem becomes the Tarski-Vaught Theorem. Mostowski collapse becomes the Mostowski-Shepherdson mapping. The Gödel condensation lemma becomes Gödel’s isomorphism theorem. They may have historical justification for some of these attributions, but surely a parenthetical “(usually called the downward Löwenheim-Skolem Theorem)” wouldn’t have hurt. (Cummings’ review also complains about this.)

In the preface, S&F call out some unusual features of their treatment of ordinals and the well-ordering theorem (Chs.4 and 5). Pollard’s sardonic comment deserves to be quoted:

The authors spare reviewers the task of crafting laudatory prose to describe this chapter. It offers, they say, “a particularly smooth and intuitive development of the ordinals”. Indeed, we are guaranteed “a beautifully natural and elegant treatment”. The authors cannot help but remark, “It is high time this neat approach should be known!” Any praise this reviewer might offer would be superfluous.

I found their treatment far less clear and intuitive than the typical one. Expositional quirks bear most of the blame (see below). The S&F approach is “top-down”, the usual one “bottom-up”. Both approaches trace their ancestry to Zermelo’s proofs of the well-ordering theorem. Now, 90% of the development looks pretty much the same with either approach: throw transfinite induction (or superinduction) at everything in sight. For the remaining 10%, I see no intrinsic advantage to top-down over bottom-up. (S&F’s implemention of the top-down approach demands Cowen’s theorem. As Pollard points out (a) its use is easily avoidable (p.478), and (b) a key definition is “a disaster” (p.476), thanks to a (fixable) careless error (not just a typo). The proof of Cowen’s theorem is neither short nor sweet.)

The preface notes another special feature of their approach:

Our main novelty lies in the use of Smullyan’s double induction and double superinduction principles… It is high time that these should appear in a textbook.

I’ve found exactly one application of double superinduction in the book, and none elsewhere. This is in the proof of comparability for ordinals. Other treatments prove comparability with ordinary transfinite induction—it’s a just a touch trickier than many such arguments. You could extract the proof of double superinduction from the comparability argument. Is the game worth the candle? S&F seem to think so.

S&F’s biggest claim to originality is Part III, the treatment of forcing via S4 modal logic. As Pollard diplomatically says, “This approach will be welcomed by scholars who have struggled with forcing, but are comfortable with modal logic.” Not a large cohort, I’ll bet! Pollard continues, “Many students, however, may lack both the motivation and the background to make sense of it.” My own two cents: unless you have a special affinity for modal logic, don’t bother. Pollard has a more positive assessment. Cummings on the other hand writes,

I do not feel that the pedagogical advantages of using modal logic to present forcing outweigh the drawbacks. In the modal approach the student is called on to master a new formal language and a new semantics before even embarking on the study of forcing; this seems likely to make a hard subject even more confusing.

Slice and dice, and the pointless generalization

You can take almost any step in a train of mathematical reasoning and extract some kind of generalization from it. Just look at the precise assumptions used in that step, invent new terms as needed, more new terms for the consequences,
and voila! A new proposition! Nine times out of ten it isn’t worth the trouble.

S&F couple this sort of generalization with “slice and dice”: scrambling the natural order of the links in a chain of argument. What should be ABCDE becomes C′⇒D0AB″;  D′⇒EB1C*. (This burdens the reader with recognizing that B1 is a special case of B″, etc.)

Their proof of Zermelo’s well-ordering theorem exhibits these quirks at full strength. Ch.13 on the constructible universe also lacks a clear narrative thread. Their addiction to generalization shows up in an amusing way. In §12.3, they introduce Gödel’s notion of a constructible class. In §13.2, they generalize this to a distinguished subclass. Two pages later, in §13.3, they take pains to point out that the proof of Lemma 3.2 still works if distinguished is replaced with the yet more general special. No other application of these more general notions is ever made.

The slice and dice technique has an unfortunate corollary: a population explosion in terminology. You need to keep track of terms like progressing, strictly progressing, slowly progressing, g-tower, slow g-tower, slow well-ordering, etc.—all slight variations on the same cluster of ideas. If at each link we look for the weakest possible hypotheses to justify that implication, the thrust of the argument is lost in a profusion of terms and ever-shifting assumptions.

I do not find fault with all their generalizations. The definition of ordinal hierarchy is amply justified by its three uses: the Rα‘s in Ch.7, the Lα‘s in Ch.12, and the RGα‘s in Ch.17. The relational system concept of Ch.10, although single-use in this book, is pleasingly simple and may furnish some insight into Mostowski collapsing maps. (S&F may have missed a bet. Shoenfield used Mostowski collapsing maps (or nearly so) to construct generic extensions; very likely S&F could have used relational systems in Ch.21 for the same purpose.)

Now for the plusses.

Detailed arguments

Perhaps you’ve never had this experience: you get stuck for a day or two in the middle of reading a proof, only to realize that a casual remark two pages back supplies the missing link. Or that a slightly different argument handles the case where  A=∅.

S&F provide detailed, step-by-step proofs; except for typos and such, “stuck in the middle” syndrome seems unlikely.

Notation/prose balance

One is naturally tempted, in a book on axiomatic set theory, to go heavy on notation. The reader must deal with formal logic; why not take advantage of its precision and concision?

Except for part III, S&F strike a near-perfect balance; if anything they sometimes push the needle a touch too far toward the “prose” end of the dial. (I wish they had made use of the satisfaction symbol.) As a small example, consider how Drake states Gödel’s result for GCH in L:

ZFC⊢Init(κ)→∀x(xH(κ)∧xLxLκ)

The S&F version (p.194):

For any infinite cardinal c, every constructible subset of Lc is an element of Lc*.

The notation-heavy style demands a steady level of “decoding” from the reader, which gets tiring over time. (And Drake is a very good textbook.)

Technical felicities

Some parts of S&F are very nicely done; their treatment of absoluteness and related concepts in §12.2 and §§14.1–14.2 is particularly clear and meticulous. They employ a trick in Ch.14 (favoring Σ over Δ1ZF) that I found technically sweet. The last part of Ch.14, proving that L has a definable well-ordering, is a gem of exposition.

Conversational style

In the bad old days, Edmund Landau’s Foundations of Analysis was regarded as the epitome of mathematical style: Definition-Theorem-Proof, with none of that frilly motivation and hand-holding. The famous Bourbaki series solidified this sort of exposition.

The past few decades have seen a welcome trend away from those chilly tomes, but not all authors can wave their hands with aplomb. Fortunately our authors can. The very best sections of S&F are the chatty ones.

I’ve quoted Pollard’s review several times already. His view of S&F’s first chapter seems a little jaundiced:

The opening chapter provides a happy-go-lucky introduction to size comparisons between infinite sets. The use of the first person singular helps to warm an atmosphere already sweetened with cozy good humor
(perplexing though the ‘I’ might be in a work with two authors). If this chapter is meant to entice and entertain readers rather than to instruct them, then it succeeds admirably.

In fact, the opening chapter covers pretty standard material: countability of $\mathbb{N}\times\mathbb{N}$, of $\mathbb{Q}$, of the set of all finite subsets of $\mathbb{N}$, and the uncountability of $P(\mathbb{N})$. S&F present this by way of an engaging metaphor; this would not be out of place in a pop math book. The chapter also covers Russell’s paradox, and economically sketches the transition from Frege’s (inconsistent) set theory to Zermelo’s axioms. Perhaps Pollard was put off by the elementary nature of this chapter; most texts on the continuum hypothesis get off to a faster start.

Nearly every chapter sets the stage with a paragraph or two of well-chosen words. Chapter 5 introduces ordinals with suitable fanfare. The opening section of part III categorizes the approaches to forcing as “non-classical logic” vs. “non-inner model” with some hand-waving of the first order. When S&F do turn to non-inner models, they don’t just plunge into the machinery of dense and generic sets over a partial order, but motivate it via Cohen’s concrete construction of a complete sequence. Finally, the closing chapter presents a brief sketch of the history of forcing, beautifully done. (Though I do disagree with one of their historical assertions—see my notes.)

Beyond these specific passages, the desire to be reader-friendly pervades the book. Better to have that goal and sometimes fall short, than not to have it at all.

Filed under Logic, Reviews

## Stirling’s Formula: Ahlfors’ Derivation

If you’re reading this blog, you probably know Stirling’s formula:

$n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$

It’s not hard to estimate n! to within a factor of √2; I wrote up a note on this and even easier derivations. It’s quite a bit harder to show that the ratio of the two sides approaches a definite limit as n→∞ and that this limit is 1. You can find a variety of proofs in a number of places, one being Ahfors’ Complex Analysis.  I wrote up a note about this too, expanding on some of the details.

Incidentally, the two sides are asymptotic not just for positive integers n. Replace n! with Γ(z+1) on the left, and both n‘s with z‘s on the right. Allow z to go to infinity in the complex plane, while staying at least a fixed finite distance to the right of the imaginary axis. Then the two sides remain asymptotic. Ahfors proves this stronger result, and uses it to derive the integral form for the Γ function.

Note that if you replace the n‘s with z‘s, you have zz on the right. So you’ve got to worry about branches of the complex logarithm (since zz is defined as ez log z). The note deals with this (and other things).

1 Comment

Filed under Analysis

## First-Order Categorical Logic 4

MW: I made up a little chart to help me keep all these adjoints straight:

Filed under Categories, Conversations, Logic

## First-Order Categorical Logic 3

JB: Okay, let’s talk more about how to do first-order classical logic using some category theory. We’ve already got the scaffolding set up: we’re looking at functors

$B \colon \textrm{FinSet} \to \textrm{BoolAlg}.$

You can think of $B(S)$ as a set of predicates whose free variables are chosen from the set S. The fact that B is a functor captures our ability to substitute variables, or in other words rename them.

But now we want to get existential and universal quantifiers into the game. And we do this using a great idea of Lawvere: quantifiers are adjoints to substitution.

Filed under Categories, Conversations, Logic

## Non-standard Models of Arithmetic 11

MW: Time to start on Enayat’s paper in earnest. First let’s review his notation. M is a model of T, a recursively axiomatizable extension of ZF. He writes $\mathbb{N}^M$ for the ω of M equipped with addition and multiplication, defined in the usual way as operations on finite ordinals. So $\mathbb{N}^M$ is what he calls a T-standard model of PA.

1 Comment

Filed under Conversations, Peano arithmetic

## Non-Standard Models of Arithmetic 10

(MW: I have converted the first few posts into pdf files, formatted both for a small screen screen and a medium-sized one.)

JB: So, last time you sketched the proof of the Paris–Harrington theorem. Your description is packed with interesting ideas, which will take me a long time to absorb. Someday I should ask some questions about them. But for now I’d like to revert to an earlier theme: how questions about the universe of sets cast their shadows down on the world of Peano arithmetic.

1 Comment

Filed under Conversations, Peano arithmetic

## First-Order Categorical Logic 2

MW: So let’s see. Last time we talked about the functor B from the category FinSet to the category BoolAlg of boolean algebras. Liberal infusions of coffee convinced you that B is covariant; I accidentally suggested it was contravariant. I think I’ve come round to your position, but I still have a couple of things I want to say on the matter. If it won’t be too confusing for our readers.

JB: Okay.  If we’re planning to talk more about the variance, it’s probably good to start out by getting the reader a bit confused.  I used to always be confused about it myself.  Then I finally felt I had it all straightened out.  Then you shocked me by arguing that it worked the opposite way.  Your argument was very sneaky.