TOC Post 7 Post 8

Posts 7 and 8 developed an extensive comment thread, mainly between Bruce Smith and John Baez. It was hard to follow in that format, so I converted it to a separate webpage.

Topics: (a) Why do standard models of ZF have standard ω’s? (b) Interactions between the Infinity Axiom and the Foundation Axiom (aka Regularity). (c) The compactness theorem. (d) The correspondence between PA and “ZF with infinity negated”: nonstandard numbers vs. ill-founded sets, and the Kaye-Wong paper (cited in post 8).

TOC Post 7 Post 8

### Like this:

Like Loading...

*Related*

Michael, thanks for this much more readable version of that comment thread, and for your own clarifying “Coda” at the end.

I have just one question about your “Coda”.

If I understand you correctly, the “there are no infinitely descending membership chains” version of Foundation is true in ZF¬∞, but only because ZF¬∞ has no set we can consider ω, so it has no function with ω as domain.

In that case, though we can say “yes, that version of Foundation is true”, don’t we need to refrain from interpreting that as meaning “there are no infinitely descending membership chains” (like it does mean in ZF)? It seems like what it really means in ZF¬∞ might be more like “infinite? what does

thatmean?”I eagerly await Michael’s answer, but I can’t resist taking a stab at this question myself.

I think the concept of infinity is perfectly well-defined in ZF¬∞: it’s just a theorem in ZF¬∞ that infinite sets don’t exist.

This is why, on that separate webpage, I mentioned that in ZF¬∞ we can easily prove

“there is no infinite set S such that the relation ≤, the transitive closure of the relation ∈, makes S linearly ordered.”

By the way, for anyone reading this who hasn’t been paying careful attention: there’s nothing spooky about ZF¬∞; it’s just a good way to axiomatize the theory of

finitesets. As a category theorist, the category of finite sets is one of my favorites. Like the category of sets it’s a topos, which means one can do a lot of math in it. And it has a lot of nice characterizations: for example it’s the free category with finite coproducts on one object, and also the free category with finite colimits on one object, and also the free symmetric monoidal category on a commutative monoid object. (In each case, the object is 1.)Yes. There are no infinite descending chains in ZF¬∞ because there is no infinite

anythingin ZF¬∞. However, various versions of “infinity” can still bedefined, if only to have their existence denied. In contrast, in some formal systems (for example, the first-order theory of groups) you can’t even express the concept. Or to be precise (always a good idea with these matters), in L(Group), there is no sentence (closed formula) φ which is true in all infinite groups and false in all finite ones. The beginning of post 17 gives an even simpler example of this phenomenon (the first-order theory of equality).[I will try to step back from the “trees”, and see again the “forest” that originally interested me.]

John wrote:

(That is, if we’re thinking about a nonstandard model M, of PA, “from outside”, representing its numbers (via some encoding of our choice) using our own (presumably standard) numbers, and we want to compute the model’s versions of addition or multiplication as functions on our encoded forms of its numbers.)

==

When I first heard that result, some time ago, I interpreted it to mean “neither addition nor multiplication (of encoded numbers of any specific nonstandard model M of PA) can be defined in an understandable way”.

But maybe that reaction was wrong. In computational complexity (which I am much more familiar with than set theory), “computable”, aka “recursive”, is at the very far

highend of “complexities we consider reasonable to think about”. But here in set theory, it’s more like the lowest end of that scale!So this raises the question — is it possible to define a specific nonstandard model M of PA (represented inside a presumed-standard model of PA or ZF), in which M’s arithmetic functions (represented as functions on some of the outer system’s numbers or sets, which we view as encoding M’s numbers in some manner we choose) are defined by specific formulas, even though we know they can’t be recursive?

(It would not bother me if M was uncountable, which seems possible if we represent M in ZF and are free to encode its numbers as whatever sets we like. Nor would it bother me if what I seek can only be done for a few, very special, nonstandard models M of PA. I am just looking for at least one nonstandard M whose addition and multiplication can be implemented “from outside” using specific formulas, which we could then try to understand.)

That’s a great question: what’s the least complex nonstandard model of PA we can “get our hands on”?

This is beyond my powers to answer well, but I’m trying to learn logic so yet again I’ll take a stab at it, and then see what Michael says.

The only way I really understand for getting nonstandard models of PA is via Gödel’s completeness theorem, which says any consistent theory in first-order logic has a model. Assuming PA is consistent, we can get a consistent extension of PA that can only have nonstandard models by adding an extra axiom like ¬Con(PA) or an extra constant c and an infinite set of axioms like

c > 0

c > S0

c > SS0

…

Then Gödel’s completeness theorem says this theory has a model.

But how do we actually

getsuch a model? Here we have to dig into the proof of Gödel’s completeness theorem. I looked into this and found the proof can’t be made “effective”, but we can prove it using the Weak König’s Lemma, a very weak relative of the Axiom of Choice that says any infinite subtree of the full binary tree (the tree of all finite sequences of 0’s and 1’s) has an infinite path.I guess this principle lets us take the various choices needed to stitch together a model and actually build the model.

Wikipedia says:

So that’s the best I can do. But someone else might be able to do much better. This Weak König’s Lemma idea is a way to get a model of

anyconsistent theory (with countably many symbols) in first-order logic. We might be able to do much better when trying to get a nonstandard model of PA.“Reverse arithmetic” is the project of seeing exactly what amount of logical strength your axioms need to prove various famous theorems. The article is saying that over a weak system of mathematics called RCA

_{0}, not only can you prove Gödel’s completeness theorem for countable languages if you assume the Weak König’s Lemma, you can prove the Weak König’s Lemma if you assume Gödel’s completeness theorem!You can use the so-called Arithmetized Completeness Theorem (ACT) to obtain a nonstandard model of PA; we’ll be saying more about the ACT in the (near?) future in this series. In the meantime, Kaye’s §13.2 discusses the ACT, also Kossak and Schmerl’s §1.12. Or this webpage from the excellent set of online notes by Tin Lok Wong that I recently ran across. Wong is a student of Kaye’s, and co-author of the Kaye&Wong paper we’ve discussed.

The nonstandard model that the ACT serves up has “complexity Δ

_{2}“. Δ_{2}belongs to thearithmetic hierarchy; see my Topics post. I will say this here: Δ_{2}is the “next level up” after “recursively enumerable”.I confess that I’ve never gone through the details of this Δ

_{2}fact about the ACT. The conclusion of the Weak Kőnig’s Lemma is that an infinite branch exists in a certain tree. Now, assertions like “there is an infinite …” typically are Π_{2}; that is, they’re of the form ∀x∃y[something simple]. To be Δ_{2}, you also need an equivalent assertion of the form ∃x∀y[something else simple]. That’s the piece I haven’t studied.If I get around to patching these knowledge gaps, I might edit this comment, or if it’s a long story, write a post.

Great! Hearing that there’s a nonstandard model of PA is exactly the sort of thing I wanted, and presumably Bruce as well. I’m somewhat familiar with the arithmetical hierarchy since I did my undergrad thesis on computable analysis, and I bet Bruce had something like that in mind in his original question. Someday maybe we can see in detail how to get a model—but even without that, the existence of such a simple model is reassuring!

Thanks, John and Michael, that is all very interesting, and indeed “you can make one with complexity Δ

_{2}” is exactly the sort of thing I was hoping to hear. (And, of course, how, and in what different ways… but I will follow the references on that.)I “read” the lecture notes for “The Arithmetized Completeness Theorem” by Tin Lok Wong (Michael’s ref above). I put “read” in quotes, because I probably understood at best 10% of it, though as far as I can tell, it’s written clearly.

In theory, I’d like to understand it all; but in practice, my time budget for further understanding is limited enough that I will not get that “deep” in the forseeable future. So I will need to continue to ask you (and anyone else willing to comment) to interpret that sort of thing for me!

It’s interesting to naively attempt to construct a nonstandard model of arithmetic and see what choices need to be made. We start with the standard ℕ and then throw in one nonstandard number

x. Then we need to put inx+1,x+2, … andx—1,x—2,… so we get a copy of ℤ containingx. Then for any positive rationalqwe need to put in a nonstandard number ⌊qx⌋. Each of one of these, it’s easy to show, is in its own separate copy of ℤ.So now the order type is right for a countable nonstandard model: a copy of ℕ followed by a dense collection of copies of ℤ with no lower bound and no upper bound.

Yet we’re barely getting started! The number is bigger than all those we have so far, and then and then… where is any fast-growing function that we can define in PA. I guess doesn’t need to be recursive, just

definable.On the other hand, the number is less than all those we have seen so far, and is even smaller, and so so on.

And at this point we start seeing how tricky it gets. I believe there are functions that are definable in PA, for which we can prove they are monotone increasing, but for which its unprovable that

for all sufficiently large

and also unprovable that

for all sufficiently large

In such a case how can we settle which is bigger, or Maybe in some cases we somehow

cansettle it using PA (though I don’t see how). But I bet that at least in some cases wecan’t. Then we get a choice: there will be a model of PA with and another model withSo we’ve got lots of choices to make — and we’re just getting started. And we somehow have to make them in a consistent fashion.

So it’s intuitively clear why something like the Weak König Lemma would show up. We’re walking down Borges’ Garden of Forking Paths. There are infinitely many routes that go on forever, but no way to tell when we make a turn that will eventually lead us into a dead end.

That is fascinating. (I had realized the part about rational multiples and powers of x, but not about f(x) for faster-growing functions f, or about ⌊log x⌋, etc.)

Suppose f and g are definable in PA (and provably total, increasing functions in PA), and PA

doesprove f(n) > g(n) for all sufficiently large n.Isn’t it still possible that “sufficiently large”, in this case, happens to be “larger than x”? Meaning that we might

stillhave to decide, for each model, whether f(x) > g(x).Another question: if we don’t need to stick with a countable model, are we free to add ⌊qx⌋ for every positive

realq? (I think it remains clear that each q lives in its own copy of Z.) (I am guessing the answer is “yes; in fact you can do that for all q in the positive part of any ordered field”.)Bruce wrote:

I’m afraid so. Proving that f(n) > g(n) for all sufficiently large n is weaker than exhibiting a specific standard N and proving that f(n) > g(n) for all n > N. So in principle we prove in PA that f(n) > g(n) for all sufficiently large n, yet in some nonstandard model only have f(n) > g(n) for all n greater than some

nonstandardN. I’m notsurethis is possible. It would be cool to get a concrete example of this phenomenon.I’m very weak in my ability to prove there are nonstandard models with chosen properties: I basically just have the compactness theorem at my disposal, which says if I have a bunch of axioms, and any finite subset of them can be consistently added to PA, then I can consistently add

allof them to PA… so the resulting axiom system must have a model. Maybe we could use this to do what you want here.Anyway, for what it’s worth, I

feelthat what you’re saying here is true.And then it should be pretty easy to show each of these numbers ⌊qx⌋ must live in its own copy of

Just to be clear: the distinction John is making is between

PA ⊢ ∃

N∀n>Nf(n)>g(n)and

there is an

Nsuch that PA ⊢ ∀n>Nf(n)>g(n)The first

Ncan be nonstandard, but the second is implicitly assumed to be standard. I would try to cook up an example using the idea that PA+¬Con(PA) is consistent. So in a model of this (I once called it “insecure PA”), there’s a proof with nonstandard length of a contradiction, but no proof with standard length. Maybe I’ll think about this the next time I take the dog for a walk.I

thinkyou can construct your model with ⌊qx⌋ for every positive realqusing the compactness theorem. You need to use an uncountable language, but that’s OK. Here’s how it goes: first add the new constantxand get your nonstandard modelMwith it. Now for a new language: add a constantcfor every_{q}q∈ℝ,q>0. The intent is thatcstands for ⌊_{q}qx⌋. Now you need a bunch of axioms, namelyc<_{p}cwhenever_{q}p<q,c+_{p}c=_{q}c, stuff like that. This is the part that needs a dog-walk to get right. Could we want_{p+q}c=_{p}ceven though_{q}p<q? Maybe ⌊px⌋ = ⌊qx⌋ even thoughp<q? Well no: there are (standard) rational numbers sandwiched in betweenpandq, and sincexis nonstandardly large, that should preclude ⌊px⌋ = ⌊qx⌋. You construct your modelMfirst, so that you can use the ⌊qx⌋’s with rationalq’s to help decide questions about thec’s with real_{q}q’s, via rational approximations.I better get going. The dog is already waiting at the door.

Well, the dog-walk yielded five thoughts about the ⌊

qx⌋ problem with realq.First, I left out one crucial group of axioms:

c=⌊_{q}qx⌋ for rationalq. Second, I forgot to mention how we know the compactness condition is satisfied—that is, that every finite subset of the axioms has a model. The idea is to use rational approximations for all thec’s appearing in the axioms—that is, rational approximations to the_{q}q’s. Itseemsclear that with only a finite number of axioms to keep happy, this will work. But I guess that would have to be checked.Third, the axiom group

c+_{p}c=_{q}cisn’t quite right, and getting it right_{p+q}couldbe a major obstacle. When does ⌊px⌋+⌊qx⌋ =⌊(p+q)x⌋? Well, it depends on the so-called fractional parts ofpxandqx, often denoted {px} and {qx}. These are defined bypx=⌊px⌋+{px}, ditto forq. If {px}+{qx}≥1, then we wantc+_{p}c+1 =_{q}c, otherwise_{p+q}c+_{p}c=_{q}c. A famous theorem says that for irrational_{p+q}q, {qn} is distributed uniformly in the interval [0,1) asnranges from 1 to ∞. (Proved independently by Bohl, Sierpiński, and Weyl, according to Niven’sIrrational Numbers; the proof by Weyl, using Fourier analysis, is slick.) So it’s not obvious to me how to determine what the value ofc+_{p}c“should be” from the rational approximations to_{q}q. Maybe it’s a “Garden of Forking Paths” situation.Fourth, if you just want an

uncountablemodel of PA, of any given cardinality, that’s easy: just use the upwards Löwenheim-Skolem theorem.Fifth, I remembered the paper Order-types of models of Peano arithmetic: a short survey, by Bovykin and Kaye. I downloaded this a while ago, but I haven’t gotten around to reading it. (No time now, unfortunately.)

Michael wrote:

In fact, I think it does more, namely, preclude their difference being standard, thus proving they’re in “different copies of Z”.

It seems to me that, at worst, this just means adding all the axioms like “c_p + c_q is either 0 or 1” for every (p, q). (Together with whatever other axioms make these all act like you want them to.)

(I imagine this has all been worked out before, since it’s a pretty obvious question. But speculating about it, and reading both of your ideas about it, is educational.)

(Thanks for the “order types survey” reference — I’ll check it out.)

Thinking about it, I realized it’s not hard to get a model of PA with a ⌊

qx⌋ for every realqwithxnonstandard. Any countably saturated model will do. (Not recursively saturated, but countably saturated, aka ω_{1}-saturated. This means that all countable types are realized.) The type we need to realize is {⌊px⌋<y:p∈ℚ^{+},p<q}∪{y<⌊px⌋ :p∈ℚ^{+},q<p}, whereyis the free variable being realized. Herexis a parameter.Incidentally, the survey article mentions that you can’t have a model whose order type is ℕ+ℝℤ. They give two different proofs. But asking for a model with ⌊

qx⌋ for every realqis a weaker demand.(WordPress is sure messing up the comment order again!)

Michael wrote:

Why would “countably saturated” and “ω

_{1}-saturated” mean the same thing?I think I

dounderstand why you want that type… and it looks to me like a countable type (with both x and q as parameters).I noticed that too, in their Theorem 3.2 (which disproves my guess about that). I agree our demand is weaker — but if I correctly guess the meaning of their next Theorem 3.3,

it(and the subsequent discussion, after the next theorem too) says that what we’re asking for is not possible.(But I guess I only understand about 30% of what’s in that part of the paper, so I might be misinterpreting that.)

==

In Section 5 they say another related thing: assuming GCH, “Also easily proved is that the order-type of any ω

_{1}-like model is ℕ + ω_{1}ℚ ℤ.”Digressing fully now: Section 7 is both the most interesting so far, to me, and the least comprehensible (hopefully that correlation is accidental). So I only skimmed it — but this caught my eye:

I would sure like to know more about this “unique random graph”!

John wrote:

I think you can get an example of that, if there is any situation where PA proves “at most a finite number of x’s satisfy P(x)”, but is not able to prove exactly how many (but the actual number is at least 1, whether or not PA can prove this). (I think I have heard about proofs of theorems like that, but I am not confident they are really like that — that is, that they don’t also come with upper bounds on the largest such x, which are “computable” in whatever sense proves to us they are standard.)

Then in the standard model of PA, suppose k x’s satisfy P(x).

Now extend PA with the axiom “at least (k + 1) x’s satisfy P(x)” and pick a model M of that.

Also define c_P = “the largest x that satisfies P(x)”.

And finally, define f(n) as “n times c_P”.

Then in M, f(n) is nonstandard for n > 0, but in the standard model of PA, f(n) is of course always standard. (And in PA, this f is provably total, increasing, etc.)

(This does not finish the story by defining a related g(n), but I guess that part would not be hard.)

(For example, I think g(n) = 2^n will work!)

(I mean, that works if we switch the order, so that we want g(n) > f(n) eventually.)

BTW, I think we could generalize this to work even if PA can’t prove “the number of such x’s is finite”, provided the number

isfinite in the standard model. Assume that number is k. Then define c_P as “the kth such x if there are only k, and the (k+1)st such x otherwise.”On reflection, I’m not really sure what the question was. It

seemedto be this:In other words, PA proves ∃

m∀n(n>m→f(n) >g(n)). Now, PA is generally assumed to be “sound”, meaning that if PA proves something, it’s true in the “standard model” ℕ. What would it mean to say that PA was unsound? Unclear. Certainly you can prove in ZF that PA is sound. So ℕ satisfies the formula, which means that themis standard.Michael wrote:

Yes, but in the example I gave, the definition of f changes meaning when you go into a different model, so in

thatmodel (I mean, some models), no standard m works anymore.(BTW, your initial idea of using PA + ¬Con(PA) can work too — just define c_P as “the Gödel number of the first proof of 0=1, or 1 if there is no such proof.”)

Above I expressed curiosity about this, from the survey:

Having just now read up on the Rado Graph, and from the context in the survey of that excerpt, I’m

guessingthat what they are saying is “if you construct the Rado graph inside a model of PA, you get something unique”. I don’t know if, by “unique”, they just mean “characteristic of that model”, or “different for non-isomorphic models”.Bruce wrote:

Umm. I think the trick is to reply only to the original post (as John did in his latest comment), and not to other comments. That way the “comment tree” stays long and skinny, rather than bushy. I’m not up for another conversion effort!

Bruce wrote:

It’s a cool entity. Here’s a really clear introduction to it—after a couple paragraphs you’ll get the basic idea:

The n-Category Café, November 6, 2009.This comment is a miscellany.

First, Bruce asked why ω

_{1}-saturated means countably saturated. In general, ‘κ-saturated’, for a cardinal κ, means that all types with <κ formulas are realized. This is more convenient, for a couple of reasons, than defining it to mean “all types with ≤κ formulas are realized”.Second, my interpretation of the “⌊

qx⌋ for every realq” question is, does there exist a modelNwith an order-preserving injection ℝ→Nsuch that distinctq’s end up in distinct ℤ-blocks? I don’t see how the theorems in §3 of the Bovykin-Kaye survey would preclude that.Third, I only skimmed §7 of the paper, but it looks like a nice illustration of the motto, “Every non-standard model of PA thinks it’s standard.” (Hmm, maybe like “Everyone is the hero of their own story”?)

Fourth, to a model-theorist, the theory of random graphs has a binary predicate standing for the edges (i.e.,

xRywhenxandyhave an edge between them), and axioms saying that for any disjoint finite setsAandBof vertices, there is a vertexxwith edges to all of the vertices inAand with no edges to any of the vertices inB. Specifically, we need an axiom for every pair of numbersm,n, saying this for setsAwithmelements andBwithnelements. (The domain of a model is all the vertices, so variables implicitly range over that—no need for a predicate ‘is-a-vertex’.)Using a back-and-forth argument, you can prove that all infinite countable random graphs are isomorphic. In this respect, the theory is like DLO, the theory of dense linear orders without endpoints. Such theories are said to be ω-categorical, a term that has nothing to do with category theory. Or maybe it does—John? (

Historically, the terms were introduced independently of each other.)Michael wrote:

No, there’s no useful mathematical connection. I guess the first influential person to have “categories” was Aristotle:

Mac Lane and Eilenberg say they grabbed the word “categories” knowing about this philosophical baggage but not really caring about it, just as they took Carnap’s word “functor”.

Bruce wrote:

Just the former. Forget “models of PA” for a minute; I think the point is that the various characterizations of the Rado graph don’t

obviouslycharacterize it uniquely up to isomorphism… and yet they do!For example, it’s a countable graph such that for every two disjoint finite sets of vertices U and V, there exists a vertex x outside both sets that is connected to all vertices in U, but has no neighbors in V. It’s not obvious that this characterizes a graph uniquely up to isomorphism… but it does! So we exclaim that this graph is unique.

Or, if you construct it by flipping a probability-p coin to decide whether to put an edge between each pair of vertices, it’s not obvious that with probability 1 you always get the same graph up to isomorphism, and that this graph doesn’t depend on p, as long as 0 < p < 1. But it’s true!

Now, maybe we can prove these things in PA, and then in any model of PA the Rado graph is unique

in that model. I haven’t at all thought about comparing how it looks in different models.Thanks for all your replies, Michael and John.

(Some of those discussion threads, I will not pursue further for now.)

About the Rado graph, I really liked this 2013 survey by Peter J. Cameron:

https://arxiv.org/abs/1301.7544

It proved all the things you both said about it, and much more, including that every countable model of every reasonable set theory has exactly this graph as its membership relation (once you forget the directionality in that relation).

But this doesn’t yet tell me what Bovykin and Kaye could have meant when they said in Section 7

“… there is a unique random graph … interpreted in a given model of PA.”

since if all they meant was

therandom graph (regardless of model, whenever the model is countable), it would seem to me to be too trivial a point to mention.(Anyway, short of asking them what they meant, I don’t know how to pursue it.)

Nice article. Cameron is a good writer.

I think this is what Bovykin and Kaye mean. First note what they say about Theorem 7.4: it “is the internalization of the fact ‘

every countable dense linear order without end-points is isomorphic toℚ’”. Then they say that “Versions of Theorem 7.4 exist for other ω-categorical theories.” The theory of the random graph is ω-categorical because the countable random graph is unique (up to isomorphism). Now, because this is aninternalizationof that fact, it means actually that the PA-modelMthinks this. So viewed “from outside”, the graph actually has cardinality |M|.You (Bruce Smith) asked:

I’ll modify the question just a bit. Suppose PA ⊢ ∃

x∀y>xφ(y). Give an example of a nonstandard model where the first suchxis nonstandard.If you have a predicate φ(

y) with this property, then letg(x) be identically 0, andf(x) be the characteristic function of φ. That gives you two functions satisfying your demand, except thatf(x) might not be increasing. To fix that, letf(x) be the characteristic function of ∃y≤xφ(y). If you wantfandgto bestrictlyincreasing, just addxto both of them.OK. Let Proves(

p,x) be the formalization of “pis a proof in PA ofx“. (That is,pis the Gödel number yada-yada-yada.) Let φ(y) be ∃p<yProves(p,⌜1=0⌝). LetNbe a model of PA+¬Con(PA), which is consistent by the 2nd incompleteness theorem. So inN, ∃x∀y>xφ(y), namely apsuch that Proves(p,⌜1=0⌝). Naturally thispmust be nonstandard.Problem: we don’t have PA ⊢ ∃

x∀y>xφ(y). Easily fixed: let φ(y) be¬∃Proves(

pp,⌜1=0⌝) ∨ ∃p<yProves(p,⌜1=0⌝)In words, either PA is consistent, or there’s a proof of a contradiction smaller than

y. ∃x∀y>xφ(y) says that φ(y) is true for all sufficiently largey. If Con(PA), then this is true for ally; if ¬Con(PA), then it’s true for anyygreater than the smallest proof of 1=0.Nice!

(Do you agree that the various examples I gave are also correct?)

Yes. I saw your first comment, with the hypothetical P(x) with k x’s satisfying P(x), but somehow missed the second one, where you suggested defining c_P to be the smallest proof of 1=0, or zero if Con(PA). We’re on the same page.