# A Comment Thread for NSA-7 and NSA-8

TOC

[MW: Posts 7 and 8 developed an extensive comment thread, mainly between Bruce Smith and John Baez. It was a bit hard to follow in that format, so I have converted it to this page, with some light editing. Topics: (a) Why do standard models of ZF have standard ω’s? (b) Interactions between the Infinity Axiom and the Foundation Axiom (aka Regularity). (c) The compactness theorem. (d) The correspondence between PA and “ZF with infinity negated”: nonstandard numbers vs. ill-founded sets, and the Kaye-Wong paper.

Bruce Smith: In post 7, you (MW) say: “We get the same ω no matter which T and which M, so long as M is a standard transitive model.”

Did you really mean “no matter which T and which M“? To me it seems like ω does depend on T (in fact this is the whole point of defining PAT), though (given your argument) not on M.

MW: If M is a standard transitive model of ZF, then its ω is the standard ω, regardless of T. (Recall that T is an extension of ZF.) Post 18 and post 19 have more about this. Incidentally, John and I are now using ‘U’ instead of Enayat’s ‘M’ for the model of ZF, which in turn lives inside an “outermost” model V of ZF. (You can think of V as “the real universe”, if that phrase doesn’t make you want to pull your hair out—it does for some people.) It’s with respect to this V that U (aka M) is standard and transitive, and U’s ω is the same as the standard one (i.e., V’s ω).

So standard transitive models of ZF have rather boring omegas. Even if we drop ‘transitive’ from our requirements, they’re still boring (merely isomorphic to the standard omega, not necessarily equal to it). Enayat’s interest are the omegas of nonstandard models of ZF; ‘nonstandard’ here means that the element-of relation in U differs from the element-of relation in V. That’s the context in which the choice of T matters.

BS: Thanks, that is very interesting. I don’t yet fully see why it’s true, but I’ll read Part 19 to find out! (And maybe look more into Enayat’s paper itself, which seems very well written and not too long.)

(BTW, you can mention any bigger universes you want—this will only reassure me that you realize that whatever you are thinking about now, is not all there is! If you call them “the real” something, well… having read your other comments on all that, I’m still not too worried. : )

JB: Let me try to see why “standard models of ZF have boring omegas”.

So, say we have a model (U,ε) of ZF in some universe (V,∈). We say this model is “standard” if the ε is the relation ∈ restricted to U.

There’s no need for the empty set of U to match the empty set of V: there’s just some unique set in U that contains no other sets in U. Let’s call it 0.

But since the membership relation ε in U matches the membership relation ∈ in V, the finite ordinals in U go like this:

0
{0}
{0, {0}}
{0, {0}, {0, {0}}}

where the notation for sets is unambiguous: it means the same thing in U as it does as in V.

Thus the omega of our model U looks like this:

{0, {0}, {0, {0}}, {0, {0}, {0, {0}}}, …}

while the omega in V looks like this:

{∅ , {∅}, {∅,{∅}}, {∅, {∅}, {∅, {∅}}} … }

So, they are isomorphic!

Well, that was very sketchy. Too sketchy. And it looks suspiciously easy—you can imagine using it to prove that the omega in any model U is isomorphic to that of V.

We need to make the argument more formal to see what’s really going on. It requires induction, clearly. But I think we also need the fact that ε in U matches the membership relation ∈. We are then in “familiar territory” as far as set theory goes, and the “obvious argument” that lets us inductively build up a bijection between the omega in U and the omega in V actually works without a hitch.

I guess it would produce more insight to let (U,ε) be a nonstandard model of ZF in some universe (V,∈), and show how this argument breaks down – either by locating a gap in the argument, or describing a counterexample.

Michael gave some useful clues in Part 19:

The real issue pops up […] for any nonstandard model N of PA. As people always put it, seen from outsideN has an infinite descending sequence n–1 > n–2 > … , for some nonstandard n. But the Least Number Principle (LNP) manages to survive in N, because the set {n, n–1, n–2, …} is not definable, and so invisible inside N. (The LNP is just the contrapositive form of the Induction Axiom.)

The three-decker sandwich looks much the same. With our (U,ε) glasses on, the infinite descending sequence {n, n–1, n–2, …} isn’t visible—it’s not one of the sets of U. But viewed from outside, from V, we can see the violation of the LNP.

The induction I mentioned earlier becomes the Least Number Principle here.

BS: Maybe without the axiom of infinity (Inf), the axiom of foundation (Found) is no longer enough to prevent infinitely descending membership chains. (I read something yesterday which I think used them together to explain how those are usually ruled out, but I can’t remember where.)

Certainly, the version of Found in Kaye and Wong (the paper referenced in post 8) doesn’t look like it can do this alone. A “downwardly infinite ordinal” would work fine (not violating Found or not(Inf)) as long as it also included the empty set, or even, all the standard natural number ordinals.

So my guess is, sets like that, in nonstandard models of “ZF negating Inf”, play the role of nonstandard numbers, in nonstandard models of PA.

[“ZF negating Inf”, is the modified version of ZF where the axiom of infinity is replaced by its negation. Denoted ZF¬∞. Similarly for ZFC¬∞, below. ed.]

If this is right, maybe these sets could be thought of as “infinite sets that think they’re finite”—kind of the opposite of the “countable sets that think they’re uncountable” in a countable model of ZF.

BTW, there is a reason I think this is important, not just a curiosity, but I’ll save that for later.

JB: You wrote:

Maybe without the axiom of infinity (Inf), the axiom of foundation (Found) is no longer enough to prevent infinitely descending membership chains.

My instinctive reaction is “no, we don’t need to assume there’s an infinite set to use the axiom of foundations to prove there’s no infinitely descending membership chain.”

Let’s see how far I can get turning that reaction into theorems.

At least in classical logic—which is what we’re using in all these posts—we can assume there either is or isn’t an infinite set.

If there isn’t, it’s a bit hard to say what an infinite descending membership chain is, since we can’t index it by the natural numbers (which we don’t have). However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

Furthermore, we can try to prove there aren’t loops of membership where $x_1\in x_2\in\cdots x_n\in x_1$. I bet that for any fixed n we can use ZF including the axiom of foundations to prove no such loop exists. Note that for any fixed n we can state the nonexistence of a membership loop of length n without using natural numbers. However, it seems hard (meaning: to me it seems impossible) to assemble all these statements into a single statement.

If there is an infinite set—or more precisely, if we assume ZF, which includes the axiom of foundations—we can construct the natural numbers, use these to define the concept of an infinite descending membership chain, and prove no such chain exists. I had trouble quickly coming up with a proof, but the proof here is very simple and nice:

“Axiom of regularity” is a synonym for “axiom of foundations”.

I think the whole study of PA is crippled if we remove the axiom of infinity from ZF, since we can’t even prove PA has a model. In this series we’re always feeling free to use the whole power of ZF, and often ZFC.

By the way, the link I just gave shows how you can use the axiom of choice to prove that the nonexistence of infinite descending membership chains implies the axiom of foundations. The idea is simple: if the axiom of foundations does not hold, you can build an infinite descending membership chain… but this requires an infinite sequences of arbitrary choices.

Incidentally, this reply on math.stackexchange (math.SE) is worth reading (unlike the question it’s a reply to). Everything it says seems obvious to me, so if it doesn’t seem obvious, please test my understanding by asking a question!

BS: John wrote:

If there isn’t [an infinite set], it’s a bit hard to say what an infinite descending membership chain is, since we can’t index it by the natural numbers (which we don’t have). However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

This is exactly what I think we can’t prove, from inside this theory ZF¬∞. (Whether or not those sets in that chain are distinct.)

What I guess is:

Without an infinite set to use as a sort of “base” or “indexing system” or “scaffolding”, we have no way to ask what happens if we probe infinitely far into a set, to look for an infinite descending membership chain (whether or not we worry about distinct sets vs. a repeating cycle of sets).

Looking at that wikipedia proof you referenced (which might be the thing I mentioned above, that I read earlier but couldn’t remember where—at least it’s a very similar proof), note that it only goes through if it can assume the existence of an infinite set, which it uses in exactly that way. I don’t see how to repeat a proof like that without having a “conventional” infinite set to start with.

John also wrote:

I think the whole study of PA is crippled if we remove the axiom of infinity from ZF, since we can’t even prove PA has a model. In this series we’re always feeling free to use the whole power of ZF, and often ZFC.

I agree, but I think maybe you’re confusing two questions—removing the axiom of infinity from the theory we’re studying, vs from the theory we use to study it.

When we say “PA is isomorphic to ZF¬∞”, and then “let’s use that fact to study PA”, obviously we can do that studying in the “real” ZF, which still does have Inf.

So in the real ZF, we can prove ZF¬∞ has a model, and clearly we do need to use Inf to prove that—otherwise we’d be constructing a model of a stronger theory using a weaker one!

The math.SE answer says: “Independently of this, be aware that if you have a model of ZFC, it may well be that you, looking at the model from the outside, can find a sequence of elements such that each has the next as an element …”

I think that might be correct (though I am not positive), but based on what MW has said in explaining Enayat’s paper, it seems clear that that model of ZFC can’t be a standard one (using the real membership relation as its own), since being standard would prevent infinite descending membership chains.

Finally, assuming I’m right about my main point above, here is what I think is “really” going on: the conventional infinite set omega is very nicely constructible (by repeatedly adding X into X as a new element, i.e., X∪{X}). This is what lets us use it in proofs like that one in wikipedia. Whereas if the only infinite things that exist are sets with infinite descending membership chains, and/or various infinitely wide sets—but no constructible infinite sequence—then you can’t use them for anything, since you can’t get a function (even as a predicate, let alone as a first-class object) which goes from such a set to one of its elements, forever.

So we need the constructible infinite sets to rule out the useless descending ones, as well as for all the other cool things we can do with them! Without them, there is simply no way to see, from inside the theory, that there is anything wrong with this infinite descending membership chain (whether it has distinct sets or not).

(Well, if they weren’t distinct, but repeated after a standard number of chain links, maybe you could find a theorem that ruled that out, for that specific depth number—as you said. But certainly if they are distinct, there’s no way to see anything wrong with them.)

If you think I’m wrong in my main point, then you have to explain what the nonstandard PA elements are, within a model of ZF¬∞, if they’re not what I’m suggesting.

You (JB) also suggested that we should test your understanding of everything in that math.SE answer. So here is one thing I think is wrong, and one where I just don’t know what he means (and would like to).

That will be because the sequence [of elements such that each has the next as an element] you can see from the outside is not represented as a function 𝑓:ℕ→𝑉 that is itself an element of the model.

I think that’s right (or might be right), but:

So formulas in the first-order language of set theory cannot even speak about whether or not it exists.

That sounds wrong. It sounds more like you could write a predicate which meant exactly that (i.e., “speak about it”)—you just couldn’t show whether or not such a thing exists (and indeed, this will be true in some models and false in others).

(And there must exist such models if ZFC is consistent at all, as a standard compactness argument shows. In fact, every model has an elementary extension that has a descending chain of naturals).

And this is what I barely understand, but would like to fully understand. (For example, what’s a “standard compactness argument”?)

JB: You wrote:

what’s a “standard compactness argument”?

The compactness theorem is an important result in first-order logic. It says that a set of axioms has a model iff every finite subset has a model.

On the one hand this follows easily from the fact that a set of axioms in first-order logic has a model iff it is consistent. Any proof only uses finitely many axioms, so a set of axioms is consistent iff every finite subset has a model.

On the other hand, the compactness theorem is a powerful tool, e.g. for proving the existence of nonstandard models of arithmetic with desired properties.

For example: if PA is has a model, so does PA together with an extra constant c obeying any finite subset of these additional axioms:

c > 0”
c > 1”
c > 2”

etc. That’s easy to see. But by compactness, it follows that there’s a model of PA with a constant c obeying all these additional axioms. And this is a nonstandard model.

Next, I wrote:

However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

You wrote:

This is exactly what I think we can’t prove, from inside this theory ZF¬∞. (Whether or not those sets in that chain are distinct.)

What I guess is:

Without an infinite set to use as a sort of “base” or “indexing system” or “scaffolding”, we have no way to ask what happens if we probe infinitely far into a set, to look for an infinite descending membership chain (whether or not we worry about distinct sets vs. a repeating cycle of sets).

Here’s something I think we can prove using ZFC¬∞. (This seems to need the axiom of choice, which I why I’m writing ZFC¬∞.)

Within ZF we can define the concept of a set equipped with a linear order: reflexive, antisymmetric, transitive, and obeying trichotomy.

We can prove any relation on R has a “transitive closure”: the intersection of all transitive relations on that set containing R.

So, we can write down a predicate P(S) on sets that means “the set S, equipped with the relation ≤ that’s the transitive closure of the relation ∈, is linearly ordered.”

Then we can write down a predicate that means “S is infinite”. There’s actually more than one way to do this, but one is “there is a bijection between S and some proper subset of S”.

Now, I believe that in ZFC¬∞ we can prove “there exists no set S such that P(S) and S is infinite”.

This is a particular way of trying to formalize the idea that there is no infinite chain of distinct sets, each containing the next.

Here’s how I think the proof goes. I think in ZFC¬∞ we can prove “there exists no set S such that S is infinite”. The desired result then follows.

In short, P(S) plays no role whatsoever!

All the work then becomes: can we show in ZFC without the axiom of infinity that the following are equivalent?

1. the axiom of infinity
2. the existence of a set S equipped with a bijection to a proper subset of S.

And the answer seems to be yes; there’s a proof in this somewhat curious document, starting around page 4.

All this may seem like a pointless digression since it has little to do with infinite chains of membership. But this is the sort of thing I meant by my original remark:

However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

Finally, someone on math.SE wrote:

So formulas in the first-order language of set theory cannot even speak about whether or not it exists.

You wrote:

That sounds wrong. It sounds more like you could write a predicate which meant exactly that (i.e., “speak about it”) — you just couldn’t show whether or not such a thing exists (and indeed, this will be true in some models and false in others).

I have trouble knowing what the offending sentence means, so instead of saying it’s wrong I’ll just say “I wish the guy hadn’t said that—what he was saying made sense up to that point, and it would be better if he left that out”.

BS: John, you wrote:

Here’s something I think we can prove using ZFC¬∞. ….

I will look at the “curious document” and think about the proof sketches you spelled out. I withhold judgement until then.

But in the meantime—consider, what actually is the successor operation (in PA, translated into ZF¬∞)?

If the number 3 looks anything like this:

{0, {0}, {0, {0}}, {0, {0}, {0, {0}}}}

then I think it must be the usual S(X) := X∪{X}.

Then observe that in a nonstandard model of PA, a nonstandard number can have that operation inverted, countably many times (each time becoming a distinct nonstandard number).

If that is correct, it proves these infinite descending chains of membership (of distinct sets) exist, in that sort of model. (And that if these models are studied inside ordinary ZF, they don’t use its membership relation as their own.)

If that is wrong, it means S(X) is defined as something entirely different. Then the question becomes, what could it possibly be defined as, that (on nonstandard numbers) could be inverted countably many times?

[Later:] I read most of the “curious document”. I reread your proof sketches, but I mostly have to take your word that they can be formulated without using Inf (but that does seem plausible to me).

You did seem to change your goal partway through—you abandoned P(S) even though that was the only thing connecting what you were saying with “membership chains”. I guess you decided it was sufficient to just prove “there is no infinite set”.

For the sake of argument, let’s suppose your proof sketches are fully correct. I think where we differ is in how to interpret your results. I think you are doing all this inside ZFC¬∞—of course it thinks there are no infinite sets! Just like PA thinks all its numbers are finite (if it can even formulate that statement about them). The question is what we would think, looking at a nonstandard model of ZFC¬∞ from outside, but in our own minds using ZF with Inf, not anything negating Inf.

Going back to your P(S), and your proof sketch using it within ZFC¬∞ that concludes “there are no infinite downward membership chains”, maybe that too can work, but it’s still being done from inside the theory of the model we’re studying, so all it shows is that there are no downward membership chains that it thinks are infinite. But since it explicitly denies that anything is infinite (more or less), that can’t be too surprising!

To put it another way—within PA, every number seems finite (whether or not it’s nonstandard). By analogy, presumably within ZFC¬∞, every ordinal seems finite too, even if it corresponds to a nonstandard number in PA. So, again, maybe all we end up proving is that there is no downward membership chain that “seems infinite to ZFC¬∞”—but maybe such chains exist which would seem infinite to an outside theory of ordinary ZF which was studying a nonstandard model of ZFC¬∞. (So, for instance, there would be bijections between them and proper subsets of them visible to the “outside examiner”, but not visible within this model, since the relations doing that bijecting wouldn’t exist inside this model.)

As support for this view, here is how the Kaye and Wong paper defines “ordinal” within ZF¬∞ (which their text seems to equate with “finite ordinal”):

4 The Ordinal Interpretation

The finite ordinals in the standard model of ZFC resemble the usual natural numbers. In a world without infinite numbers, one would expect the class of all ordinals to satisfy the ordinary rules of arithmetic.

Definition 4.1 Trans(x) (‘x is transitive’) is the formula

∀y,z (z ∈ y ∧ y ∈ x → z ∈ x),


and x ∈ On (‘x is an ordinal’) is

Trans(x) ∧ ∀y,z∈x (y ∈ z ∨ y = z ∨ z ∈ y).


I don’t see those formulas failing to apply to these “infinite descending ordinals” I’m suggesting as the interpretation of nonstandard numbers. If anything rules them out, it’s not that definition.

(BTW, I also don’t see how the usual rules of ordinal arithmetic (as I very slightly understand them) can work properly for the apparent order types of nonstandard numbers in PA. That is a mystery I’d like the answer to.)

[Later:] I just noticed this, on the same wikipedia page you linked to, which seems to say what I’m saying:

Notice that this argument only applies to functions f that can be represented as sets as opposed to undefinable classes. The hereditarily finite sets, Vω, satisfy the axiom of regularity (and all other axioms of ZFC except the axiom of infinity). So if one forms a non-trivial ultrapower of Vω,, then it will also satisfy the axiom of regularity. The resulting model will contain elements, called non-standard natural numbers, that satisfy the definition of natural numbers in that model but are not really natural numbers. They are fake natural numbers which are “larger” than any actual natural number. This model will contain infinite descending sequences of elements. For example, suppose n is a non-standard natural number, then n–1 ∈ n and n–2 ∈ n–1, and so on. For any actual natural number k, nk–1 ∈ n–k . This is an unending descending sequence of elements. But this sequence is not definable in the model and thus not a set. So no contradiction to regularity can be proved.

JB: You wrote:

You did seem to change your goal partway through—you abandoned P(S) even though that was the only thing connecting what you were saying with “membership chains”. I guess you decided it was sufficient to just prove “there is no infinite set”.

I know what you mean, but that was my goal all along. I’d summarized the whole argument in a previous comment:

However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

The argument is pathetically simple: if there are no infinite sets, then there’s no infinite chain of distinct sets, since they would be the members of an infinite set.

It’s so simple it’s almost a joke! At least it should make you smile. The business about “each containing the next” is just a red herring, as far as the logic of this argument goes.

However, I found it interesting to see if I could formulate a predicate P(S) meaning “S is a chain of sets, each containing the the next”, without needing a set to index S. So that was a separate sideshow. There are often different ways to try to formalize an idea, and while P(S) is my attempt to formalize “S is a chain of sets, each containing the the next”, what it actually says is “the set S, equipped with the relation ≤ that’s the transitive closure of the relation ∈, is linearly ordered.”

I think you are doing all this inside ZFC¬∞—of course it thinks there are no infinite sets!

Definitely! I’m working in ZFC¬∞. That’s what I signaled from the start:

My instinctive reaction is “no, we don’t need to assume there’s an infinite set to use the axiom of foundations to prove there’s no infinitely descending membership chain.”

At least in classical logic—which is what we’re using in all these posts—we can assume there either is or isn’t an infinite set.

If there isn’t, it’s a bit hard to say what an infinite descending membership chain is, since we can’t index it by the natural numbers (which we don’t have). However, if we know there’s no infinite set, it feels like we should be able to prove there’s no infinite chain of distinct sets, each containing the next, since those sets would form an infinite set.

So in this branch of my argument I’m assuming ZFC¬∞. In the other more interesting branch I assumed ZFC, which includes Inf as usual.

It sounds like you are interested in something more complicated, like a model of ZFC¬∞ inside ZFC. That raises other questions. I’m sorry if everything I’ve been discussing is tangential to your real interests; however, I wanted to get some basic points nailed down.

BS: John, you wrote:

Definitely! I’m working in ZFC¬∞. That’s what I signaled from the start: ….

Ah, I now think we have just been misunderstanding each other all along. I think we have both been right, but were talking about different things.

So let me try to start over, hopefully with more clarity, and also add a bit of motivation.

My level of interest in “nonstandard models of PA” is much higher than my level of knowledge about them. I am especially interested in learning more about the “nonstandard numbers” they contain.

(Before I go any further—I am using “nonstandard” in the usual sense associated with “nonstandard number”. Unfortunately this is not equivalent to (the opposite of) another use of “standard” that has come up in this blog post series, and will also come up below, namely, “when a model uses the same membership relation as the outer theory studying it”. When that second use comes up far below, I’ll point this out again.)

I have a motivation for better understanding “nonstandard numbers”, involving a possible way of applying them; but discussing that would take us too far afield for now. Suffice it to say, I’d like to know a lot more about them (and various models containing them) than I do now. The “most advanced” things I know now are things like this (if any of these are wrong, someone please tell me!):

• If T is consistent, but T thinks T can prove 0=1, then in any model of T, there is such a proof (coded as an object in that model), but (seen from outside) the length of that proof is “nonstandard”, so it doesn’t translate into a “valid proof in the outside world” of 0=1.
• In a nonstandard model of PA, the “set of all numbers” (necessarily seen from outside, since there is no such object in its models) has an order type equivalent to one copy of ℕ followed by infinitely many copies of ℤ, arranged “densely” (in a sense whose precise details I don’t know, i.e., I don’t know what happens at “each end”). (The only part of that I could probably prove is “it starts with ℕ, and the rest are partitioned into an infinite number of copies of ℤ, which don’t overlap”. I can prove the number of ℤ-copies is infinite, since each multiple of some nonstandard number has to be in a different one. I don’t know how to prove anything else about them.)
• That “order type picture” tells you how to “compute” successor or its inverse, in an obvious way. I have heard, but have no idea how to prove, that if you additionally wanted some finite representation in which you could also compute arbitrary sums and products, you can’t get one, for any nonstandard model. (I don’t know a precise statement of that result, but would like to know.)
• But all models think they’re standard. For example, there is no predicate p(x) within PA that means “x is nonstandard”. Every p(x) that you might intuitively think means that, is true of no elements in any model.
• Even from outside of a model M, there is a sense in which “M is nonstandard” is a relative term:
• I.e., you might be thinking about a model M of T from inside a larger model M′ of some theory T′, which knows T is consistent (since it thinks M is a model of T, and the existence of a model of T proves T is consistent).
• Then T′ might be able to prove “M is standard” or “M is nonstandard”; or, lacking such proofs, specific models M′ of T′ might “satisfy” “M is standard” or “M is nonstandard”. (More precisely: if there is such a proof, all models M′ of T′ will agree with its result; otherwise, different models M′ will differ on that. BTW I’m assuming without stating that whichever “outermost” theory I’m considering here is consistent.)
• But if T′ proves “M is standard”, or even if M just looks standard to M′ (without proof in T, and perhaps without proof in T′), M might still look nonstandard from outside T′ and M′, if (in that farther out point of view) M′ is a nonstandard model of T′! (Whether this can happen in the other direction—M′ thinks M is nonstandard, but from outside M′, M can look standard again, provided M′ looks nonstandard from there—I don’t know, and would like to know!)
• Other things can also be “relative to point of view” in this way, e.g. whether some infinite set is countable.

So—when I was told that

ZF with infinity negated (let’s call it ZF¬∞) is “basically the same” as PA

and given the Kaye/Wong paper which formalizes this, I thought, “aha, perhaps being able to view nonstandard numbers as sets will give me a bit more intuition about how they behave! (And with luck and care, perhaps some of that intuition will be valid.)”

But the Kaye/Wong paper doesn’t say anything about nonstandard models of either PA or ZF¬∞, so I have to try to infer (or guess) facts about these, within this correspondence, from how they define the correspondence, and from what they say about how it works in the standard case.

Before I start on that, observe the situation:

• we’re looking at some model M of PA from some point of view in which M looks “nonstandard”.
• we’re going to see M, the model of PA, as in some sense “equivalent” to a model M2 of ZF¬∞.
• so it shouldn’t surprise us if “M looks like a nonstandard model of PA” will turn out to be equivalent to “M2 looks like a nonstandard model of ZF¬∞”.
• but since we know “M thinks it’s standard”, we fully expect to also find “M2 thinks it’s standard”.

So, about the specific correspondence, following the Kaye/Wong paper—fortunately, it’s very simple! We interpret the number m as a set, by expressing it as a sum of powers of 2, 2i, and saying each i in that sum is one of the members of m (and m has no other members). (This is the “Ackermann interpretation” of (I would say) ZF¬∞ within PA, i.e., of elements of PA as being elements of ZF¬∞, in a 1-1 onto correspondence.)

This lets us spell out how numbers in PA correspond to sets in ZF¬∞. Some of those sets in ZF¬∞ will also be Von Neumann ordinals—let me call them V(k) if they are the k-th finite one of those.

So (interpreting the plain numbers I write here as numbers in PA or as whatever set they correspond to in ZF¬∞, whereas V(k) is defined first as a set in ZF¬∞, and only from that as a corresponding number in PA):

0 is the empty set {} == V(0)
1 = 20 = {0} = V(1)
2 = 21 = {1} = {{0}}
3 = 20 + 21 = {0, 1} = {0, {0}} = V(2)
7 = 20 + 21 + 22 = {0, 1, 2} = {0, {0}, {{0}}}
11 = 20 + 21 + 23 = {0, 1, 3} = {0, {0}, {0, {0}}} = {V(0), V(1), V(2)} = V(3)

and so on.

Let’s inductively define V(k). The PA expression for union, in general, is not so simple; but if we happen to know the sets are disjoint, it’s just the sum of the numbers they correspond to. So

V(k) = V(k–1) ∪ {V(k–1)}

becomes

V(k) = V(k–1) + 2V(k–1)

which together with V(0) = 0 recursively defines all the V(k).

(I’m glad you (John) indirectly induced me to spell out the correspondence like this, since it shows me two misunderstandings I had: (1) I sometimes assumed V(k) had a simpler formula in terms of k, within PA, than it does; and (2) I had not sufficiently noticed Kaye and Wong’s statement that “[the ordinal] interpretation is clearly not inverse to the Ackermann interpretation”, and had sometimes been implicitly assuming they were the same, and ordinals in ZF¬∞ covered all numbers in PA. I think my comment about “what form PA’s successor function S() could take inside ZF¬∞” betrays both misunderstandings. This doesn’t affect my main points, though.)

Kaye and Wong then go on to “define an inverse to the Ackermann interpretation”, so the formula above for V(k) (with k being an ordinal index, and V(k) being the number within PA which represents that set) remains true in both directions of that correspondence.

Now we can ask: given that correspondence, what do nonstandard elements of PA look like, as sets in some corresponding model of ZF¬∞?

Here is the part where our misunderstanding presumably arose—I perhaps left it implicit that, since I wanted to see these elements from a point of view that already saw them as nonstandard, it might well need to be a point of view “seeing ZF¬∞ from outside”, so it’s safer to just assume it is that, to start with. Furthermore, since I believe Inf, I don’t want to cripple my own understanding by denying Inf, which is a reason to let the outermost formal point of view also believe Inf, and therefore, not be ZF¬∞.

So for both reasons, I implicitly did assume, and now explicitly will assume, that:

• there is an outer theory of ZF with a model M′ (corresponding to “my own beliefs”);
• M′, which “thinks” using ZF, is looking at a model M of PA (so M is some object in M′, i.e., a set);
• M′ is also looking at an equivalent model M2 of ZF¬∞ (with M2 also being a set in M′).
• Furthermore, M′ knows (or believes) almost everything I know (or believe):
• M as a model of PA, is equivalent to M2 as a model of ZF¬∞ (under the specific “Ackermann interpretation”, and its inverse, as defined in Kaye and Wong and partially spelled out above);
• M (as a model of PA) is a nonstandard model of PA. (That is, it contains nonstandard numbers. I’m not bringing in the other use of the term “standard” until much later.)

(The only relevant thing I believe, which M′ doesn’t know, is that M′ exists, and therefore, ZF is consistent.)

(Maybe I should rename M to MPA and M2 to MZF¬∞. I will sometimes use those below, when I think it makes it clearer.)

So, consider a nonstandard element (number) in M, namely n.

(We (i.e., M′) know n is nonstandard—M doesn’t know that, of course.)

Let’s use this Ackermann interpretation to understand n as a set in MZF¬∞ (from the same outer point of view, namely, within M′.)

We start by knowing n is larger than any standard (finite) n, and therefore has at least one 1 in its binary representation at a position k (i.e., corresponding to 2k in the sum of positionally-weighted binary digits which equals n), where k is also larger than any finite number, i.e., also nonstandard.

So we have (in M′): if n is nonstandard, then there exists k such that k < n (in PA), k is a member of n (in ZF¬∞), and k is nonstandard (in PA). (I am saying ‘k’ in either PA or ZF¬∞, and also saying “in PA” when it might be more precise to say “in M”, etc. I think it’s all still clear enough—let me know if not.)

So, clearly there is an infinite descending chain of membership, within MZF¬∞, starting with any set in MZF¬∞ which corresponds to a nonstandard number in M under the Ackermann interpretation.

But, again, this chain only “looks infinite” from this outer point of view of M′ (using the theory ZF)—the same point of view from which n “looks nonstandard”.

Within PA itself, we can “prove n is finite” by observing something like “for sufficiently high n0, every n1 > n0 is also > n”. Or we could just exhibit the numeral 1+1+…+1 which equals n, and (in accompanying commentary, since this can’t be done formally) point out that it has finite length (though M′ would see that same numeral as having nonstandard length).

Similarly, within MZF¬∞ and its theory ZF¬∞, we can “prove there is no infinite descending chain of membership”, probably in general (like in the proof you sketched), and certainly for n specifically (just by listing out the entire “tree” of memberships by which n is constructed, and saying “see, it’s finite”).

But M′ would still understand that:

• n is nonstandard;
• in MZF¬∞ the set corresponding to n does have an infinite descending chain of membership;
• that the proofs above, in each sub-theory, which are about that specific n being finite in all senses, have nonstandard length;
• whereas the general proofs in those sub-theories, which say “for all n, n is finite in all senses”, have standard length.

(Here is where the other sense of the term “standard” is going to enter in:)

One additional conclusion we can make—since M′ sees MZF¬∞ as having “an infinite descending chain of membership”, but M′ is a model of ZF which can prove it has no infinite descending chains of membership, clearly MZF¬∞ must not be a “standard” model within M′—that is, the membership relation used in MZF¬∞ must not be the “real” one used in M′. (Unfortunately I don’t think this use of “standard” is equivalent to the opposite of my uses of “nonstandard” above—I think they are two related but different terms, with the same name.)

Thus, M′ doesn’t think its own set n, corresponding to n in M = MPA, and the equivalent set n in MZF¬∞, really has an infinite descending chain of membership—it just thinks it has an infinite descending chain of the-membership-relation-used-in-MZF¬∞.

(But, to go “farther outside” for the first time—if M′ did think MZF¬∞ was a “standard model”, this would imply it was also standard (i.e., not nonstandard) in my earlier sense; but then if an outer M″ thought M′ was nonstandard in my earlier sense, it might still see M as also being nonstandard in that sense.)

So, I hope that is clear (except for the two distinct uses of “standard”!), convincing, and fully extricates us from mutual misunderstanding. Let me know if not!

Although I think this is now a peripheral point to our main discussion, I should add that, skimming Kaye and Wong section 5, where they show that even to fully define the inverse Ackermann correspondence within ZF¬∞ requires the axiom TC (roughly, “transitive closures over downward membership exist”), I suspect the same would be true of any argument trying to “extract” a membership chain into a single outer set representing it as a relation, from which you might talk about whether or not that set was infinite. So I suspect that, even within ZF¬∞ (and using “infinite” as “infinite from its point of view”), it’s possible you can’t prove (in fact, perhaps you can’t even express—I’m not sure) “there are no infinite descending membership chains”, without also assuming TC. I didn’t try to check any of this in detail, since I don’t yet fully follow their section 5.

JB: Bruce wrote:

In a nonstandard model of PA, the “set of all numbers” (necessarily seen from outside, since there is no such object in its models) has an order type equivalent to one copy of ℕ followed by infinitely many copies of ℤ, arranged “densely” (in a sense whose precise details I don’t know, i.e., I don’t know what happens at “each end”).

In what follows we’re looking at a fixed nonstandard model of PA in a universe of sets obeying ZF.

“Densely” here means that between any two distinct copies of ℤ there is another copy of ℤ. It follows that between any two distinct copies of ℤ there are infinitely many copies of ℤ.

This is not terribly hard to show. Suppose we have x in one copy of ℤ and y in some other copy of ℤ. Suppose without loss of generality that x<y. Then our model has an element $a=\left\lfloor \frac{x+y}{2}\right\rfloor$

with

x < a < y

and one can show that a is in a copy of ℤ that’s distinct from the copies of ℤ containing x and y.

The main hard part in proving this is showing that the floor function can be defined in PA, and proving in PA that it obeys all the usual inequalities one expects, like $x<\left\lfloor \frac{x+y}{2}\right\rfloor

whenever y–x>1. Somewhere there should be a book on PA that develops huge amounts of basic math like this starting from the axioms.

In the case at hand, saying that x and y live in distinct copies of ℤ means that

y–x > n

for each standard n.

(Understanding this requires that we know what we mean by the standard natural numbers in our model of PA. Suffice it to say that they are the smallest submodel of PA. Whenever you or I mention ℕ in this particular entry, this is what we’re talking about: the standard natural numbers. Similarly, when we talk about “a copy of ℤ” we’re talking about a standard copy, i.e., ℕ ∪ –ℕ.)

Using our assumption that

y–x ≥ n

for each standard n, we can show for each standard n that

ya ≥ ⌊n/2⌋

and

ax ≥ ⌊n/2⌋

Again, the only hard part about this is showing that all such basic facts can proved using PA. So, we see that a is in a different copy of ℤ from either x or y.

That’s how we know that between any two copies of ℤ there’s another. There’s also one between any copy of ℤ and the copy of ℕ down at the bottom. There’s also a copy of ℤ to the right of any copy of ℤ: that is, they keep on going forever.

By the way, in getting a sense of this stuff it’s good to know Cantor’s theorem: any linearly ordered set that’s countable, dense, and has no lower or upper bound is isomorphic to ℚ as an ordered set. Here ‘dense’ means that between any two distinct elements there’s another element.

So, one can show a countable nonstandard model of PA consists of a copy of ℕ followed by ℚ’s worth of copies of ℤ.

Bruce wrote:

I have heard, but have no idea how to prove, that if you additionally wanted some finite representation in which you could also compute arbitrary sums and products, you can’t get one, for any nonstandard model. (I don’t know a precise statement of that result, but would like to know.)

This is Tennenbaum’s theorem, and a precise statement and sketch of a proof is here:

• Wikipedia, Tennenbaum’s theorem.

In fact neither addition nor multiplication can be computable in a nonstandard model!

Wikipedia only sketches the proof for addition. It’s very nice. It uses the fact that we can encode any finite set S of natural numbers as a single number n, which is divisible by the i-th prime iff i is in S. (Many different n encode the same set S.) In a nonstandard model, “finite” takes on a new meaning, so some sets of standard numbers that you and I would normally call infinite can be encoded in a nonstandard number n.

Coda

MW: Unfortunately, I wasn’t able to participate in this thread as it unspooled. But you’ve covered much of what I would have said anyway! Still, I’ll add a few remarks.

(1) The Foundation axiom. People often state this informally as, there is no infinitely descending chain of sets s1s2∋… . But if we’re going to go wading into the weeds, we should use the exact formulation:

Any non-empty set has an ∈-minimal element. In other words, if y is non-empty, then there is an xy such that no element of x is an element of y. In other words, xy = ∅.

This makes sense even without the Axiom of Infinity (Infinity for short). To formalize the “descending chains” version, we ask first, what does ‘ω’ mean without Infinity? Well, the definition of ω can be formalized in L(ZF); you end up with a formula is-omega(x), and you can prove in ZF (even without Infinity) that there is at most one set x satisfying is-omega(x). Infinity basically says that there exists a set satisfying is-omega.

The “descending chains” version of Foundation looks like this:

There is no function f with domain ω such that for all n ∈ ω, f(n+1) ∈ f(n). In other words, there does not exists a set w satisfying is-omega(w), and a function f with domain w, such that for all nw, f(n+1) ∈ f(n).

This is trivially true in ZF¬∞: no ω means no functions with domain ω! In ZF without Infinity, you can prove that the ∈-minimal version of Foundation implies the descending chains version. Thus: the range of such a function would be a nonempty set without an ∈-minimal element. (Doesn’t matter if elements repeat in the chain or not.) The other direction needs the Axiom of Choice (AC), or at least the weaker form called Dependent Choice. The idea is quite simple: if y is a nonempty set without an ∈-minimal element, choose an element s1 of y, then choose an element s2 of s1y, then choose an element s3 of s2y, ad infinitum. (All those intersections are nonempty because y has no ∈-minimal element.)

(2) Standard and transitive. As mentioned, a model (U,ε) of ZF is standard if the element-of relation ε is the “actual” ∈ in the “real world” (or whatever (V,∈) is our outermost model). (Too bad about this use of ‘standard’, but the terminology is, ahem, standard.) John noted that the ZF formula ‘xy’ has the same meaning in (U,∈) and the containing (V,∈). (In the jargon, the formula is absolute between U and V.) On the other hand, a so-called bounded quantifier, like ‘∀xy’ won’t always have the same meaning, unless (U,∈) is also transitive. ‘Transitive’ means that if xyU, then x ∈ U.

If U isn’t transitive, then y ∈ U might have elements in V that didn’t make it into U. Extreme case: y might “look like the empty set” in U, though nonempty in V. In general, to see what yU “looks like” in U, you have to throw out any elements of y that don’t belong to U, then go through the remaining elements and throw out any of their elements not in U, etc.

If U is a standard transitive model of ZF, then its ω is the “actual” ω. Transitivity isn’t such a big deal for this reason: you can define an isomorphism from any standard model of ZF to a standard transitive model. It’s called “Mostowski collapse”, and essentially just involves the “throwing out” process of the last paragraph. (The Marie-Kondo-ization?) So the ω of a standard model is always isomorphic to the “actual” ω, at least.

(3) “Invisible” sets. This is the key to how numbers in a nonstandard model M of PA can “look standard” from inside M, and Foundation can fail for a nonstandard model (U,ε) of ZF.

First PA. The contrapositive of the Induction Axiom schema is the Least Number Principle: every definable nonempty subset of M has a least number. PA has no variables explicitly ranging over sets, so instead this is stated as an axiom schema. Thus: if φ(x) is a formula in L(PA), then “there exists an x such that φ(x)” implies “there exists a least x such that φ(x)”. A nonstandard element n of M has many associated sets that thumb their noses at the Least Number Principle (LNP): {n, n–1, n–2,…}, or the set of all nonstandard predecessors of n, and many others. The LNP replies, “I can’t see you! I can see only definable sets!”

Same story for Foundation in a nonstandard ZF model (U,ε). (U,ε) must satisfy Foundation as seen “from the inside”. But seen from V, it might not. There could be an infinite descending chain with f(n+1) ε f(n), but the function f would exist only in V, not in U. Likewise, we could have a subset y of U without an ε-minimal element, but y itself would not be an element of U. It’s an “invisible subset”. The range of the “infinite descending chain” is an example. Foundation in (V,∈) has no gripe with this y: it cares only about ∈-minimal elements, not a fig about ε-minimal elements.

For a standard (U,∈), infinite descending ε-chains are actually ∈-chains in V. So V’s Foundation forbids them. If y is a subset of U, then y has an ∈-minimal element x in V. Because y is a subset of U, x also belongs to U. Because x is ∈-minimal in V, x∩y= ∅ in V, and so a fortiori in U.

Upshot: any standard model (U,∈) satisfies Foundation “as seen from the outside”. Its ω, therefore, “really” satisfies the Least Number Principle, even for “invisible subsets”—subsets visible only from V. This excludes nonstandard models of PA.

That’s all, folks!

TOC