# Category Archives: Logic

## Non-standard Models of Arithmetic 22

Prev TOC Next

MW: So we have our setup: BMN, with N a model of PA, B a set of “diagonal indiscernibles” (whatever those are) in N, and M the downward closure of B in N. So B is cofinal in M, and M is an initial segment of N. I think we’re not going to go over the proof line by line; instead, we’ll zero in on interesting aspects. Where do you want to start?

BS: Let’s start with how we’ll end up proving M is a model of PA. As we do that, maybe you’ll need to bring in more about B, and/or about using the PHP in N—but since my “real” interest is more general, it would be fine if it turns out the technique only depends on a few aspects of those!

MW: Sounds good! Well, there are three parts to showing that M is a model of PA. (1) M is closed under + and ·. (2) M satisfies all the axioms other than the induction axioms. (3) M satisfies the induction axioms.

Parts (1) and (3) use the properties of diagonal indiscernibles. Part (2) turns out to be the easiest: it uses just the fact that M is an initial segment closed under + and ·. Incidentally, an initial segment  closed under successor is called a cut, but there doesn’t seem to be a standard term for one closed under + and ·. A supercut?

Part (1) uses the “growth properties” of B. Part (3) uses that transfer principle I talked about last time, without explaining what it was. And Part (2) uses a simpler transfer principle.

It might make sense to start with Part (2), just assuming Part (1). It’s a nice warm-up.

BS: Okwhatever order you think is best. But it seems to me that given (1) and that M is an initial segment containing 1, (2) is trivial. Am I missing something?

MW: Nope, it is pretty easy. A warm-up, like I said. I just posted the PA axioms (two equivalent versions, in fact). If you peruse them, you’ll see that apart from the induction axioms, all but one are ∀1 formulas: a block of universal quantifiers followed by a quantifier-free predicate. ∀1 formulas are persistent downwards. So (aside from that exceptional axiom) that takes care of Part (2) in one fell swoop!

Nothing specific to PA here—this is a general fact of model theory. If A is a substructure of B, and $\varphi(\vec{x})$ is an ∀1 formula, and $B\models\varphi(\vec{c})$ for a list of elements $\vec{c}\in A$, then $A\models\varphi(\vec{c})$. “As above, so below.” (Note that we must demand $\vec{c}\in A$, otherwise $A\models\varphi(\vec{c})$ wouldn’t make sense.)

Our first transfer principle, nice and easy. We need Part (1), by the way, because without it we don’t know that M is a substructure of N.

BS: In hindsight, it’s well worth seeing that spelled out—it’s straightforward, but not as trivial as I thought. I do see how to take care of the “holdout”:

xy  ( x<y ↔ ∃w x+(w+1)=y )

We have to prove that if x + w + 1 = y, then w < y, to make sure w is in M. As easy as that is, I hadn’t explicitly noticed the need to prove it, until you pointed out how that axiom differs from the others.

MW: Right. The trick is to break it into two axioms:

xy  ( x<y → ∃w x+(w+1)=y )

xy  ( ∃w x+(w+1)=y x<y )

Because PQ is the same as ¬PQ, the second half is also1. For the first half, we have the key fact that you noted: for x,yM, the w given by the axiom is less than y, so it also belongs to M.

So another transfer principle handles the axiom: Π1 downwards persistence. Namely: say we have two structures for L(PA), A and B, with AB. Suppose also that A is an initial segment of B. Then Δ0 formulas are absolute between A and B: persistent both upwards and downwards. And Π1 formulas are persistent downwards. The initial segment requirement has paid off by making more formulas persistent.

Now, “∀xy(x<y → ∃w x+(w+1)=y)” is ∀2, aka ∀∃. We can’t apply any of these transfer principles to it. But it’s implied by this formula: “∀xy(x<y → ∃w<y x+(w+1)=y)”. That’s a Π1 formula. It holds in N, so it holds in M.

BS: That’s interesting! I guess it is worth reformulating as many axioms as possible into Π1 form, to make it clearer that they transfer this way.

MW: More generally, this shows the insights afforded by knowing where formulas land in the arithmetic hierarchy.

OK, let’s backtrack to Part (1)showing that M is closed under + and ·. A couple of observations first. Suppose r1<r2<…<rn<… is an infinite increasing sequence of elements of N, and M is the downward closure of this sequence. If rn+1≥2rn for all n, then M is closed under addition. If rn+1rn2 for all n, then M is closed under multiplication.

BS: Because everything in M has to be under some rn! Clever!

MW: Now we need the definition of a set of diagonal indiscernibles in the model N for a set Φ of formulas. Here’s what I said in post 9:

For diagonal indiscernibles, the truth-value of φ(r1,…,rn,c1,…,ch) is the same no matter what the ri’s, provided that {r1,…,rn} is a subset of B, that r1<…<rn, and that all the ci’s are less than an indiscernible b which is less than all the ri’s. (Think of b as a kind of barrier—or DMZ.) And of course that φ belongs to the given set Φ of formulas—Δ0 formulas in this case.

BS: I’m not sure I understand that correctly—is the following reformulation equivalent?

Given φ and c1,…,ch, let b be minimal in B such that c1,…,ch<b. Then for every r1<…<rn all greater than b, φ(r1,…,rn,c1,…,ch) has the same truth value. (I.e., given those conditions, its truth value depends only on φ and c1,…,ch.)

MW: We shouldn’t demand that the barrier be minimal. It’s enough that it separates the parameters from the other indiscernibles. Let me try another phrasing. I’ll also change notation just a bit, for later convenience.

BN is a set of diagonal indiscernibles for Φ in N, if the following holds. For every formula φ(x1,…,xn,y1,…,yh) in Φ and for every tuple of elements c1,…,ch in N and for every b,b1,…,bn,e1,…,enB, if

c1,…,ch<b<b1<…<bn  and  c1,…,ch<b<e1<…<en

then

N⊧φ(b1,…,bn,c1,…,ch)    ⇔    N⊧φ(e1,…,en,c1,…,ch).

I call b the barrier, and everyone calls the ci’s the parameters.

The numbers n and h will depend on the formula φ. Notice that the bi’s and the ei’s must be in increasing order, but the ci’s don’t have to be. As for the order of bi’s and ei’s with respect to each other, nothing is assumed about that.

BS: Ok, that’s clear. How does the barrier end up helping us?

MW: The key fact is that the barrier must belong to B; it turns out that any increasing sequence of diagonal indiscernibles grows at a very rapid clip. So having an element of B in between the ci’s on one side, and the bi’s and the ei’s on the other, means the two sides must be separated by a pretty big gap! If I may speak metaphorically, the bi’s and the ei’s are so far over the horizon of the ci’s, that the ci’s can’t distinguish between them—they’re indiscernible to the ci’s!

The role of the barrier may become clearer when we go through the proof that rn+1≥2rn and rn+1rn2 for all n.

BS: Ok, I’m looking forward to that! (I might even think I can guess something about it, but I’ll wait and see.)

Prev TOC Next

Filed under Conversations, Peano Arithmetic

## Topics in Nonstandard Arithmetic 7: Truth (Part 3)

Prev TOC Next

Previous “Truth” post

Last time we looked at Tarski’s inductive definition of truth formalized inside ZF set theory. Recall the setup: a first-order language L and a structure S for L. The domain of S is a set (not a proper class), which means that the relations and functions on the domain are too. The fruit of the effort was a formula in L(ZF), True(L,S,φ), which expresses “S is a set-based structure for L and φ is a sentence of L and S ⊧ φ”. (With minor changes, we get a parametrized version, True(L,S,φ,u). Here u is a finite list of elements of the domain of S to be assigned to the free variables in φ.)

In the first post in this series, we looked at {Trued}, an infinite sequence of formulas in L(PA); Trued expresses truth for formulas with parse-depth at most d. In this post, we’ll look at SatΣn, satisfaction for Σn formulas.

SatΣn is the “professional-grade” version of Trued; it’s what you’ll find in the textbooks in the bibliography, like Kaye. (I suppose “satisfaction” sounds more professional than “truth”.) Like Trued, it’s an infinite sequence of formulas. Like Trued, it limits its ambitions by complexity: parse-tree depth for Trued, and the number of quantifier alternations for SatΣn. So (for example) the axiom ∀xyz((z=0 ∨ ¬x<y) ∨ x·z<y·z) has depth 6, but belongs to Π1. (SatΠn amounts to the negation of SatΣn, since a Πn sentence is true iff its negation, a Σn sentence, is false.)

The main effort in defining SatΣn takes place at the ground level, defining SatΔ0. Surprising! And this definition shares DNA with True(L,S,φ) from ZF. Here are some analogies to conjure with:

PA : infinite sets  ::  ZF : proper classes
PA : finite sets  ::  ZF : sets.

In ZF, quantification over a set is a “bounded search”, in some sense. In PA, the same holds for quantification over a finite range.

The construction of True(L,S,φ) rested on the concept of an instantiation tree. We can do nearly the same thing for Δ0 sentences. Here’s an example, for the sentence ∃x<3 ∀y<x ψ(x,y):

I’ve made a different graphic choice from the diagram for the ZF case. Instead of regarding the instantiations “∀y<0 ψ(0,y)”, “∀y<1 ψ(1,y)”, and “∀y<2 ψ(2,y)” as all “living in the same node”, I’ve split them into separate nodes. This reflects a more fundamental difference: in the ZF case, all the quantifications are “bounded” by the same set, the domain of S. Here the bound varies from node to node.

The instantiation tree must be grown from the root down. Our Δ0 formula is a sentence, i.e., no free variables. Of course subformulas need not be closed. Any outermost quantifier must have a fixed bound—that is, a closed term for the bound, which can be evaluated. The associated quantifier node will have a bounded list of children, in which the quantified variable has acquired a fixed value. The example above is particularly simple, but something like

x<10 (¬ξ(x) ∨ ∃y<3x w<x+y (ζ(x,y,w)∧η(x,y,w)) )

goes much the same way. We’ll have 10 nodes under root, each a disjunction. The one with x=5 will have a right hand child ∃y<15… . One of those children has y=7, and is labeled ∀w<12 (i.e., x+y=5+7). Its 12 children are all conjunctions.

Truth evaluations proceed bottom up, of course, using the obvious rules. Now, can we code all this into PA? The diagrams look pretty when the bounds are all elements of ℕ, but over a nonstandard model the bounds can be nonstandard. That’s OK, provided we can code all this into a formula of PA. (And by ‘formula’, I mean an ordinary vanilla formula, not one of “nonstandard length” or anything like that.)

The short answer is, sure, no problem! PA is “basically the same as” ZF¬∞. The instantiation tree and its truth evaluation “look like” finite combinatorial objects, so of course PA can encode them. The “local behavior” aspect of the ZF case—a thousand quantifiers in the formula don’t bleed through to give a thousand quantifiers in True(…)—applies here too.

For the long answer, consult Kaye, Chapter 9. He begins by saying

This is the chapter that no one wanted to have to write: the material here is technical and difficult to describe clearly, yet it is very necessary for later work. I have put it off as long as possible, but cannot do so any longer.

Nineteen pages later, he gives the definition of SatΔ0. A few pages more establish its chief properties. Then in a quarter of a page, he defines the sequences SatΣn and SatΠn, building on top of SatΔ0. It’s much like Trued: a syntactic ∃x becomes semantic. Life is just a touch more complicated, because SatΣn must handle a block of existential quantifiers prepended to a Πn−1, formula; likewise for SatΠn. But PA has mastered lists, so this is barely a hiccup.

Just a few more remarks. If you do decide to read Kaye’s Chapter 9, be aware he does not explicitly use parse-trees. Instead, he ‘parses’ the Gödel numbers that represent strings of symbols. You will also find data structures that you can construe as instantiation trees, although he does not make this explicit. I confess I have not had the patience to go through his treatment line by line. My sympathy for the author’s pains!

As a reward for the detailed development, Kaye can show that SatΔ0 is a Δ1(PA) formula, and SatΣn and SatΠn are Σn(PA) and Πn(PA) formulas (respectively) for n>0. (The quantifiers “bleed through”.) A diagonal argument now shows that the Σnn hierarchy is a true hierarchy: for any model N of PA, there is a relation on N that is Σn(N) but not Πn(N), and vice versa.

As with Trued, PA proves the expected equivalence for any sentence φ with the right complexity. That is, if φ belongs to Σn(PA), then PA⊢φ↔SatΣn(⌜φ⌝). Likewise for the parametrized version. Note that this is a collection of proofs, one for each formula φ.

That’s all for now!

Previous “Truth” post

Prev TOC Next

Filed under Peano Arithmetic

## Topics in Nonstandard Arithmetic 6: The Axioms

This is a “reference” post. With all the posts already filed under Peano Arithmetic, I realize I never explicitly stated the axioms. Of course you can find them on Wikipedia and at a large (but finite) number of other places, but I thought I should put them down somewhere on this site.

Filed under Peano Arithmetic

## Non-standard Models of Arithmetic 21

Prev TOC Next

Bruce Smith joins the conversation, returning to a previous topic: the Paris-Harrington theorem. (Discussion of the Enayat paper will resume soon.)

BS: Post 8 and post 9 discussed the Paris-Harrington theorem, or PHT. I have some questions about “how that proof really works”. But my main motivation is more general—what are the various ways in which people know how to construct nonstandard models of PA, with various properties of interest? And what other kinds of statements might someday be proved unprovable, using those methods?

I’m especially interested in anything analogous to “forcing” (which Paul Cohen invented to prove CH unprovable in ZF). The PHT’s method involves finding a submodel, so it’s not directly analogous to forcing—but we have to start somewhere, and its proof certainly seems interesting and educational. So we can reasonably limit the scope of this post to “how the Paris-Harrington Theorem really works”.

MW: Sounds good. I glossed over several issues in post 9, so I’m happy to revisit it. But I’m glad you give me an opportunity to plug my Logic and Smullyan notes. Section 11 of the logic notes discusses forcing in its simplest context, PA and recursion theory. Section 21 of the Smullyan notes deals with forcing for set theory, and covers Cohen’s classic results.

All that said, the proof of the Paris-Harrington Theorem isn’t related to forcing, as far as I can tell. (Although you never know.) It comes out of a different line of thought in logic, the method of indiscernibles.

The book by Kossak & Schmerl has a chapter on forcing methods applied to models of PA, but I haven’t read it yet.

BS: OK—here is how I understand it so far. (And, thanks for the references!)

The PHT states that a certain combinatorial claim, known as the Paris-Harrington Principle (PHP), is not provable in PA (if PA is consistent, which we’ll hereafter assume without stating).

Roughly, its proof goes like this: if the PHP holds in a nonstandard model N of PA, it can be used (partially from “within N”) to construct a submodel M of N, where M is also a model of PA, but one in which the PHP doesn’t hold! But if the PHP was provable in PA, it would hold in all models, including all nonstandard models (of which we know there is at least one, since PA is consistent and has independent sentences). QED!

(That argument does not rule out that the PHP is refutable in PA, thus holding in no models of PA — but that can be disproven in other ways, since ZF can prove the PHP holds in the standard model of PA.)

The proof of PHT involves using the PHP (from “inside N”?) to find a set B of “indiscernibles” in N, and then (I think from “outside N”) defining the subset M of N as their “downward closure. Then from “outside N”, it proves M is a model of PA, and that the PHP doesn’t hold in M.

MW: So far so good. Just one thing: finding the set B of indiscernibles uses a mix of “inside N” and “outside N” reasoning, typical of the overspill principle. But we’ll get to that. Carry on!

BS: Ok—that is exactly the sort of subtlety that I hope to understand better!

To continue—part of how that works (proving that M is a model of PA) is that, in N, there is a “finite” c, which we (outside N) understand is actually larger than every standard number. In particular that means we can have a “finite” set of formulas—finite according to N, that is—which is really infinite. Somehow you can use the PHP (which applies to finite sets), from “inside N”, to tell you something about that infinite set of formulas.

MW: That’s right. The thing that it tells you is the so-called diagonal indiscernibility with respect to that set of formulas. We can get into the details later. By the way, the set of formulas does not include all the PA axioms—the argument is more subtle than that.

BS: In general, it seems this theorem carries out important reasoning both inside and outside N—untangling this is part of the question of how it works. For example, I think there is no way N can agree with us that “my subset M is a model of PA”, since that would mean it had a proof (that is, had an element which it could verify was the Gödel number of a proof) that PA has a model, and therefore is consistent. In fact, I am guessing the sets B and M can’t even be defined from inside N.

MW: Right! So far as N is concerned, N is an initial segment of every model of PA. Just like ℕ “really” is. (The scare-quotes because of all the semi-philosophical issues churning around that adverb.)

I wouldn’t emphasize the proof aspect, though. N can “believe” things—in other words, it can satisfy things—without necessarily having an “N-proof” of them. In much the same way, ℕ satisfies Con(PA), even though there is no proof of Con(PA) inside PA.

BS: Let me unpack that, and you can tell me if I have it right. Con(PA) means “there is no PA-proof of 0=1”, so “ℕ satisfies Con(PA)” means “there is no ℕ-element which encodes (as computed in ℕ) a PA-proof of 0=1”. This is stronger than “ℕ satisfies PA”, which just says “each axiom of PA holds in ℕ”.

MW: Exactly right. You’re also right about B: that can’t be defined inside N. To state it a little more precisely, there is no formula β(x) in L(PA) (the language of Peano Arithmetic) such that bB if and only if N⊧β(b). If there were such a formula, then we’d have a formula for the downward closure: μ(x) ≡ ∃y(β(y) ∧ x≤y).

BS: Thanks—that makes things much clearer. I generally read “N ⊧ φ” as “N believes φ”—I think I have seen this referred to as both “N believes φ” and “N thinks φ” in papers, as well as “N satisfies φ” and perhaps “N models φ”—but then I sometimes get confused and think “N believes φ” is saying “N has a proof of φ”. But (as we just discussed) that is a much stronger statement. “N ⊢ φ”, i.e. “N proves φ”, means “N has an element which it believes is the Gödel number of a proof of φ” (using some theory that is clear from the context). Maybe it would reduce my confusion to stick to reading “⊧” as “satisfies”.

MW: “Satisfies” is the usual textbook term. People use “believes” and “thinks” to make everything more anthropomorphic. That helps them—helps me—think about the math. I also like to talk about things that are “invisible when you’re wearing N glasses”, like the set B. The N people can see the individual elements of B, but not the set B as a whole. Of course, this is all just blog-speak for the stuffier (but perhaps clearer) language of the textbooks.

(My advisor once suggested that the entire mathematical lexicon exists only as an aid for our weak minds. Super-mathematicians would just have a single list, Definition 1, Definition 2, …. Definition 304726, … and likewise for all mathematical theorems.)

I wouldn’t write “N ⊢ φ”. N is a model. Models satisfy statements, theories prove them. So we can write “PA ⊢ φ”. Now, if you want to say, “In the model N, there is a proof, perhaps of nonstandard length, of φ from the axioms of PA”, you could write “N ⊧ (PA ⊢ φ)”. Here “(PA ⊢ φ)” is what I call ‘vernacular’: not actually an expression in the formal language, but intended as a shorthand for a formal expression. Or I might put PA ⊢ φ in quotes, to indicate it’s vernacular.

BS: Ok, I will try to follow those rules—that does clarify things.

By the way, I just noticed that you wrote N⊧β(b) even though b might be nonstandard. Does that mean we are talking about N satisfying a nonstandard formula?

MW: No—in that formula, b is a name, so β(b) is a standard formula even if the model element named by b is nonstandard.

BS: What are “names”?

MW: New constants. Suppose we have a structure A for a language L. So L has relation symbols, maybe also function symbols and constants. A has corresponding relations, functions, and elements. We don’t necessarily have a constant for every element of A. A standard trick in model theory is to expand L by adding a new constant for every element of A. Many people (including me) use LA to denote the expanded language, and call the new constants names. (I say a little bit more about them in this Topics post.)

BS: Thanks—these details do help!

Back to the PHT—my biggest question was about how we “really” prove M is a model of PA, since the stated method in Post 9 sounded like it was trying to do more than ought to be needed—it seemed to want M to be “similar to N” in some sense.

MW: Good place to start. Are you familiar with the concepts of elementary equivalence and elementary submodel?

Let me give a précis anyway. Suppose you have a first-order language L and two structures A and B for this language. A and B are elementarily equivalent if they satisfy exactly the same first-order sentences: A ⊧ φ iff B ⊧ φ for any closed formula φ of L. If A is a substructure of B, then A is an elementary substructure of B if for any first order formula $\varphi(\vec{x})$ in L and any $\vec{a}$ in A, then $A\models\varphi(\vec{a})$ iff $B\models\varphi(\vec{a})$. (If A and B are both models of a theory T in the language L, then we say ‘elementary submodel’ instead.) By the way, I use the notation $\vec{a}$ as shorthand for a1,…,an. The ai’s are names.

‘Elementary substructure’ is stronger than ‘elementary equivalence’, since it allows for names of elements of A in the formulas.

BS: So do I understand correctly that ‘elementary’ basically means ‘every formula is absolute’ (when comparing the two structures involved)?

MW: Yes, that’s right. Absolute between the two structures. For elementary equivalence, this absoluteness holds for all closed formulas of L. For elementary substructure, for all closed formulas of the expanded language LA.

BS: It feels like this post is all “preliminary” so far—though it is certainly a necessary and helpful discussion (for me anyway).

MW: Well, I think we’ve gotten a little ways in. We have the setup: N is a nonstandard model of PA, B is a subset of “diagonal indiscernibles” of N (whatever those are), M is the downward closure of B. We know that N satisfies the PHP, by hypothesis. We know the goal of the proof: to show that M is a model of PA, and that M does not satisfy the PHP.

I brought up elementary equivalence for the following reason. If we could show that M was elementarily equivalent to N, then it would follow immediately that M was a model of PA. That’s how I thought it went for a moment, the first time I read the proof. But that can’t be right, because PHP holds in N and not in M!

I think you were getting at this when you wrote, “it seemed to want M to be ‘similar to N’ in some sense”. That is an aspect of the proof: the authors demonstrate a “transfer principle” that transfers certain statements back and forth between M and N. This transfer principle is then used to show that M is a model of PA. But the principle is more subtle than elementary equivalence, or elementary substructure.

BS: Ok—I am eager to see where all this goes, in the next post!

Prev TOC Next

Filed under Conversations, Peano Arithmetic

## Topics in Nonstandard Arithmetic 5: Truth (Part 2)

Next “Truth” post

Last time we looked at Tarski’s inductive definition of truth, expressed informally. We saw how for models of PA, it can be formalized as an infinite sequence of formulas True0, True1, …, formulas belonging to L(PA) itself. But not as a single formula in L(PA).

Filed under Peano Arithmetic

## Topics in Nonstandard Arithmetic 4: Truth (Part 1)

In post 15 of the Conversation, I observed:

• Gödel’s two most famous results are the completeness theorem and the incompleteness theorem.
• Tarski’s two most famous results are the undefinability of truth and the definition of truth.

The second bullet has occupied its share of pixels in the Conversation. Time for a summing up.

Filed under Peano Arithmetic

## Non-standard Models of Arithmetic 20

Trudy Campbell

MW: OK, let’s recap the setup: we have a three-decker ωUUV. So far as U is concerned, ωU is the “real, true omega”. V knows it isn’t. Enayat’s question: what properties must an omega have, for it to be the omega of a model of T? Here T is a recursively axiomatizable extension of ZF, and U is a model of it.

1 Comment

Filed under Conversations, Peano Arithmetic

## Topics in Nonstandard Arithmetic 3: The Arithmetic Hierarchy (Part 2)

Last time I defined ∃n and ∀n prefixes and formulas; Σn, Πn , and Δn relations (and functions) on ℕ; Σn(PA), Πn(PA), and Δn(PA) formulas in L(PA); and Σn(N), Πn(N), and Δn(N) relations (and functions) on a model N of PA. I won’t repeat all that, but a few bullet points may help load it into working memory:

Filed under Peano Arithmetic

## Topics in Nonstandard Arithmetic 2: The Arithmetic Hierarchy (Part 1)

Filed under Peano Arithmetic

## Nonstandard Arithmetic: A Long Comment Thread

Posts 7 and 8 developed an extensive comment thread, mainly between Bruce Smith and John Baez. It was hard to follow in that format, so I converted it to a separate webpage.

Topics: (a) Why do standard models of ZF have standard ω’s? (b) Interactions between the Infinity Axiom and the Foundation Axiom (aka Regularity). (c) The compactness theorem. (d) The correspondence between PA and “ZF with infinity negated”: nonstandard numbers vs. ill-founded sets, and the Kaye-Wong paper (cited in post 8).