**JB:** So, you were going to tell me a bit how questions about the universe of sets cast their shadows down on the world of Peano arithmetic.

**MW: **Yup. There are few ways to approach this. Mainly I want to get to the Paris-Harrington theorem, which Enayat name-checks.

First though I should do some table setting of my own. There’s a really succinct way to compare ZF with PA: PA = ZF − infinity!

Here’s what I mean. Remove the axiom of infinity from ZF. The minimal model of this is *V*_{ω}, the set of all *hereditarily finite* sets. A set is hereditarily finite if it’s finite, and all of its elements are finite sets, and all of *their* elements are finite sets, and so on down.

(Actually I was sloppy—I should have said, “replacing the axiom of infinity with its negation”. I thank Sridhar Ramesh for pointing this out.)

To show that ZF − infinity (let’s call it ZF¬∞) is “basically the same” as PA, you have to code things in both directions. We know how the integers are coded: von Neumann’s finite ordinals. (Logicians often just say “integers” when they really mean “non-negative integers”; does this drive number theorists crazy?)

**JB:** If any number theorists actually talked to logicians, it might.

(Just kidding: there are actually lots of cool interactions between number theory and logic, with ideas flowing both ways.)

**MW:** In the reverse direction, suppose we’ve already coded the all the elements of {*a*_{1},…,*a _{n}*} as integers {

*i*

_{1},…,

*i*}. We just need a way to code finite sets of integers as single integers. Lots of choices here, say bitstrings, or using a product of primes:

_{n}{*i*_{1},…,*i _{n}*} ↔

*p*

_{i1}…

*p*

_{in}(Here *p _{i}* is the

*i*-th prime, naturally.)

Coding between and *V*_{ω} is just the appetizer. Next comes translating the formal statements between L(PA) (the language of PA) and L(ZF) (the language of ZF). This calls for another trick—something called Gödel’s lemma, which uses the Chinese remainder theorem in a clever way. (Turns out you need this just to show that the “*i*-th prime” function can be expressed in L(PA).) And after that the main course: showing that PA can prove all the (translated) axioms of ZF¬∞, and vice versa. Kaye devotes a chapter to spelling out *some* of the details, off-loading the rest to exercises.

The point of all this? Just that you can do finite combinatorics in *V*_{ω} without much gnashing of teeth. The Paris-Harrington principle is a variant of the finite Ramsey theorem; the Paris-Harrington *theorem* says that the Paris-Harrington principle is unprovable in PA (assuming, of course, that PA is consistent!) People like to call this the first “natural” unprovable statement—“natural” because combinatorialists might give a darn, not just logicians.

**JB: **I understand Goodstein’s theorem a million times better than the Paris-Harrington principle, and I can easily see how the natural proof uses induction up to ε_{0}, though I don’t know the Kirby-Paris proof that it’s unprovable in PA. I thought this came before the Paris-Harrington theorem?

**MW: **Goodstein proved his theorem way back in 1944. Kirby and Paris proved its unprovability (in PA) in 1982, in the paper that also introduced the Hydra game. Although I don’t know that you can rely on publication dates for who knew what when. Anyway, Goodstein, Hydra, and the Paris-Harrington principle hang out together like the Three Musketeers. Or the Four Horsemen of Unprovability, if you include Gentzen’s result about induction up to ε_{0}, and how it’s the precise “proof strength” of PA.

Jan van Plato wrote a historical article with the delightful title Gödel, Gentzen, Goodstein: The Magic Sound of a G-String. He says:

Goodstein had no proof of the independence of his theorem from PA, but it is clear to a careful reader of his paper that he indeed was convinced of the said independence. … part of the reason for the early neglect of Goodstein’s theorem lies in

a false modesty. … The same attitude is well displayed by Goodstein’s subservient acceptance of each and every comment and criticism of Bernays. The latter persuaded him to suppress the claims to independence from the final version, and one can only speculate what effect a clear-cut conjecture of independence could have had on future research.

Getting back to the Paris-Harrington theorem: the original proof used non-standard models of PA. (Kanamori and McAloon soon simplified it; the treatment in the recent book by Katz and Reimann is particularly easy to follow.) That’s the one I want to look at here.

But since you had such fun with ordinals here (and here and here), I better add that Ketonen and Solovay later gave a proof based on the ε_{0} stuff and the hierarchy of fast-growing functions. (The variation due to Loebl and Nešetřil is nice and short.) We should talk about this sometime! I wish I understood all the connections better. (Stillwell’s Roads to Infinity offers a nice entry point, though he does like to gloss over details.)

I figured you must have written about Ramsey’s theorem at some point, but I couldn’t find anything.

**JB:** No, I’ve never really understood Ramsey theory. More precisely, I never understood its appeal. So whenever I try to read about it, I become bored and quit. “In any party of six people either at least three of them are mutual strangers or at least three of them are mutual acquaintances.” I’m sure there’s *something* interesting about this. I just haven’t gotten interested in it.

**MW:**But you do know, right, it comes in finite and infinite flavors?

**JB:** Yes.

**MW:** Ramsey was doing logic when he discovered his most famous result: he solved a special case of Hilbert’s Entscheidungsproblem.

You can think of Ramsey’s theorem as a supercharged pigeonhole principle. If you color an infinite set of points with a finite number of colors, at least one color paints an infinite number of points. OK, now color instead the edges of the complete graph on an infinity of points. Then there’s an infinite subset of the points, such that the complete graph on that subset is *monochromatic*. That’s not so obvious! Next step: again start with an infinite set of points, but this time color all the triangles (unordered triples of points). Then there’s an infinite subset of points with all its triangles the same color. And so on.

For the finite version, we start with a large finite set of points, and look for a subset of a given size such that all its points, or edges, or triangles (etc.) have the same color. You’ll find it if the original set of points is *sufficiently large*. How large equals “sufficiently”? That depends on (a) the cardinality of the things you’re coloring (points, edges, triangles, etc.); (b) the number of colors; and (c) how big you want your monochromatic set to be.

Now here’s an interesting fact. The finite version of Ramsey’s theorem follows from the infinite one by a routine compactness argument. (Or you can use Kőnig’s infinity lemma.) Ramsey himself gave a direct inductive proof for both versions—so the finite version is a theorem of PA. Paris and Harrington made just a *small tweak* to the finite version. The compactness argument barely notices the change—the tweaked version (the Paris-Harrington principle) still follows from the infinite Ramsey theorem. But the direct inductive proof comes to a crashing halt!

ZF can prove the Paris-Harrington principle, an assertion purely about *V*_{ω}, but ZF¬∞ can’t! Does this count as the infinite casting its shadow on the finite?

(Hmmm, I’ve spent so much time table-setting that now it’s time to walk the dog. We’ll have to get to the (model-based) proof of the Paris-Harrington theorem in the next post.)

**Addendum:** I thought Ketland’s comment deserved some discussion. First, thanks for the reference!

The folklore result I had in mind was “derivability equivalence”. This is a corollary to Prop. 2.1 in the Kaye-Wong paper. Say *f*:*S*→*T* is an interpretation of theory *S* into theory *T*, and write σ^{f} for what we get by translating a formula σ of *S*-language into *T*-language. Suppose *S*⊢σ. Then *T*⊢σ^{f} . If we also have an interpretation of *T* into *S*, then *S* and *T* are equi-consistent. That’s the situation with PA and ZF¬∞.

Kaye and Wong are after something stronger. Ackermann had provided inverse maps between and *V*_{ω}; Kaye and Wong leverage these to get interpretations that are truly inverse: (σ^{f})^{g} = σ, (τ^{g})^{f} = τ. You don’t get that with the interpretations I sketched—if you code von Neumann’s 3 into , the result isn’t 3! However, you have to replace ZF¬∞ with ZF¬∞+TC to get inverse interpretations, where TC is what they call the axiom of transitive containment (equivalent to ∈-induction).

Skimming the paper gave me a half-baked idea. It looks like Kaye and Wong are constructing an isomorphism between two categories. I wonder if the “folklore” interpretations give a natural equivalence?

The paper says, “when the details were finally uncovered, there were surprises for both authors”. (Interesting, because Kaye outlined some similar ideas in exercise 11.4 of his book.) For me, the need for the axiom of transitive containment was unexpected, although in hindsight it makes perfect sense. When describing *hereditarily finite*, I said, “and all of *their* elements are finite sets, and so on down.” That’s just a folksy way of saying “the transitive closure is finite”. If we can’t ask ω to do some heavy lifting for us, showing that there *is* a transitive closure doesn’t seem to be possible with just ZF¬∞.

[This post and the previous one acquired a long comment thread, mainly between Bruce Smith and John Baez. It was difficult to follow in that format, so I have converted it to a separate webpage. Topics: (a) Why do standard models of ZF have standard ω’s? (b) Interactions between the Axiom of Infinity Axiom and the Foundation Axiom (aka Regularity). (c) The compactness theorem. (d) The correspondence between PA and “ZF with infinity negated”: nonstandard numbers vs. ill-founded sets, and the Kaye-Wong paper. –*MW*]

Advertising this post on Twitter, I wrote:

and Sridhar Ramesh replied:

and then:

In logic, one says theories A and B (in disjoint signatures) are definitionally equivalent if there are definitional extensions A+ and B+, which are logically equivalent. As it turns out, this definitional equivalence relationship is intimately connected to the invertibility of the translation maps between A and B. So theorems of A can be translated to theorems of B and vice versa and the translation functions mutually invert each other.

For “finite ZF” and PA, the detailed definitional equivalence was only recently carefully worked out, though it’s a folklore result: it involves the (1937) Ackermann encoding of sets as numbers. So a binary predicate intuitively meaning “n is an element of m” can be defined inside PA (ultimately, just using 0,S,+ and x; but in fact it needs exponentiation), and it behaves just like the membership predicate on finite sets. But, like much in arithmetic involving coding, the details are messy!

The equivalence is between these two theories

(i) PA

(ii) ZF – axiom of infinity + ~(axiom of infinity) + the axiom of “transitive containment”

The details appeared in

Richard Kaye & Tin Lok Wong, 2007 “On Interpretations of Arithmetic and Set Theory”, Notre Dame Journal of Formal Logic.

(URL: https://projecteuclid.org/euclid.ndjfl/1193667707)

Thanks a lot for the clarification! Now I’m curious about whether Michael was alluding to this result or some earlier “folklore”.

Thanks for the reference! I’ve added an addendum to the post.

“Skimming the paper gave me a half-baked idea. It looks like Kaye and Wong are constructing an isomorphism between two categories. I wonder if the “folklore” interpretations give a natural equivalence?”

I suspect that’s right. The functors here are something like natural transformations between categories (though my knowledge of category theory isn’t up to much). For category-theoretic takes on the topic of interpretations which might clarify this, there’s

Visser 2004, “Categories of Theories and Interpretations”

Friedman & Visser 2014, “When Bi-Interpretability Implies Synonymy”

(I think the first is published somewhere; but you’ll find both using Google on Visser’s preprint page.) Mutual invertibility of the interpretations/translations is important if you want exact definitional equivalence, as the Kaye/Wong paper makes clear. My colleague Ali Enayat adds some more information about the PA and ZFfin + TC relationship here, as well as a lot of other interesting stuff (also about: when does bi-interpretability imply definitional equivalence?)

Enayat 2018, “Variations on a Visserian Theme”

But Ali’s paper is all done using standard math logic, and no category theory!

Pingback: Nonstandard Models of Arithmetic | Azimuth

About “ZF¬∞” and PA being basically the same — can I assume they each have “nonstandard models” and those also work in equivalent ways? Is it as “simple” as saying “in a nonstandard model of ZF¬∞, there might be a set X whose finite cardinality is a nonstandard number in PA”?

Worse (or at least weirder), could this affect the number of levels of depth in hereditary finiteness, or can I continue to rely on

thatbeing standard? (I’m guessing this too can become nonstandard, but I don’t have a confident guess.)Yes, that’s right. (Especially because you put “simple” in scare-quotes.)

The number of levels indeed can become nonstandard. One direction of the equivalence uses ω as the model of PA (after defining the operations). So if the number of levels is always standard, then the model of PA you get is standard. In the other direction, finite sets are represented by finite bitstrings. So if you start with the standard model of PA, you end up with the standard model of ZF¬∞.

(Well, now I’m pretty sure of that guess, since it looks like you could just code up a function in PA which was the maximum number of levels of depth used in (the numeric encoding of) a “set”.)

Yup, right.

A long comment thread, mainly between Bruce Smith and John Baez, has been converted to a separate webpage.