Monthly Archives: October 2020

Non-standard Models of Arithmetic 21

Prev TOC Next

Bruce Smith joins the conversation, returning to a previous topic: the Paris-Harrington theorem. (Discussion of the Enayat paper will resume soon.)

BS: Post 8 and post 9 discussed the Paris-Harrington theorem, or PHT. I have some questions about “how that proof really works”. But my main motivation is more general—what are the various ways in which people know how to construct nonstandard models of PA, with various properties of interest? And what other kinds of statements might someday be proved unprovable, using those methods?

I’m especially interested in anything analogous to “forcing” (which Paul Cohen invented to prove CH unprovable in ZF). The PHT’s method involves finding a submodel, so it’s not directly analogous to forcing—but we have to start somewhere, and its proof certainly seems interesting and educational. So we can reasonably limit the scope of this post to “how the Paris-Harrington Theorem really works”.

MW: Sounds good. I glossed over several issues in post 9, so I’m happy to revisit it. But I’m glad you give me an opportunity to plug my Logic and Smullyan notes. Section 11 of the logic notes discusses forcing in its simplest context, PA and recursion theory. Section 21 of the Smullyan notes deals with forcing for set theory, and covers Cohen’s classic results.

All that said, the proof of the Paris-Harrington Theorem isn’t related to forcing, as far as I can tell. (Although you never know.) It comes out of a different line of thought in logic, the method of indiscernibles.

The book by Kossak & Schmerl has a chapter on forcing methods applied to models of PA, but I haven’t read it yet.

BS: OK—here is how I understand it so far. (And, thanks for the references!)

The PHT states that a certain combinatorial claim, known as the Paris-Harrington Principle (PHP), is not provable in PA (if PA is consistent, which we’ll hereafter assume without stating).

Roughly, its proof goes like this: if the PHP holds in a nonstandard model N of PA, it can be used (partially from “within N”) to construct a submodel M of N, where M is also a model of PA, but one in which the PHP doesn’t hold! But if the PHP was provable in PA, it would hold in all models, including all nonstandard models (of which we know there is at least one, since PA is consistent and has independent sentences). QED!

(That argument does not rule out that the PHP is refutable in PA, thus holding in no models of PA — but that can be disproven in other ways, since ZF can prove the PHP holds in the standard model of PA.)

The proof of PHT involves using the PHP (from “inside N”?) to find a set B of “indiscernibles” in N, and then (I think from “outside N”) defining the subset M of N as their “downward closure. Then from “outside N”, it proves M is a model of PA, and that the PHP doesn’t hold in M.

MW: So far so good. Just one thing: finding the set B of indiscernibles uses a mix of “inside N” and “outside N” reasoning, typical of the overspill principle. But we’ll get to that. Carry on!

BS: Ok—that is exactly the sort of subtlety that I hope to understand better!

To continue—part of how that works (proving that M is a model of PA) is that, in N, there is a “finite” c, which we (outside N) understand is actually larger than every standard number. In particular that means we can have a “finite” set of formulas—finite according to N, that is—which is really infinite. Somehow you can use the PHP (which applies to finite sets), from “inside N”, to tell you something about that infinite set of formulas.

MW: That’s right. The thing that it tells you is the so-called diagonal indiscernibility with respect to that set of formulas. We can get into the details later. By the way, the set of formulas does not include all the PA axioms—the argument is more subtle than that.

BS: In general, it seems this theorem carries out important reasoning both inside and outside N—untangling this is part of the question of how it works. For example, I think there is no way N can agree with us that “my subset M is a model of PA”, since that would mean it had a proof (that is, had an element which it could verify was the Gödel number of a proof) that PA has a model, and therefore is consistent. In fact, I am guessing the sets B and M can’t even be defined from inside N.

MW: Right! So far as N is concerned, N is an initial segment of every model of PA. Just like ℕ “really” is. (The scare-quotes because of all the semi-philosophical issues churning around that adverb.)

I wouldn’t emphasize the proof aspect, though. N can “believe” things—in other words, it can satisfy things—without necessarily having an “N-proof” of them. In much the same way, ℕ satisfies Con(PA), even though there is no proof of Con(PA) inside PA.

BS: Let me unpack that, and you can tell me if I have it right. Con(PA) means “there is no PA-proof of 0=1”, so “ℕ satisfies Con(PA)” means “there is no ℕ-element which encodes (as computed in ℕ) a PA-proof of 0=1”. This is stronger than “ℕ satisfies PA”, which just says “each axiom of PA holds in ℕ”.

MW: Exactly right. You’re also right about B: that can’t be defined inside N. To state it a little more precisely, there is no formula β(x) in L(PA) (the language of Peano Arithmetic) such that bB if and only if N⊧β(b). If there were such a formula, then we’d have a formula for the downward closure: μ(x) ≡ ∃y(β(y) ∧ x≤y).

BS: Thanks—that makes things much clearer. I generally read “N ⊧ φ” as “N believes φ”—I think I have seen this referred to as both “N believes φ” and “N thinks φ” in papers, as well as “N satisfies φ” and perhaps “N models φ”—but then I sometimes get confused and think “N believes φ” is saying “N has a proof of φ”. But (as we just discussed) that is a much stronger statement. “N ⊢ φ”, i.e. “N proves φ”, means “N has an element which it believes is the Gödel number of a proof of φ” (using some theory that is clear from the context). Maybe it would reduce my confusion to stick to reading “⊧” as “satisfies”.

MW: “Satisfies” is the usual textbook term. People use “believes” and “thinks” to make everything more anthropomorphic. That helps them—helps me—think about the math. I also like to talk about things that are “invisible when you’re wearing N glasses”, like the set B. The N people can see the individual elements of B, but not the set B as a whole. Of course, this is all just blog-speak for the stuffier (but perhaps clearer) language of the textbooks.

(My advisor once suggested that the entire mathematical lexicon exists only as an aid for our weak minds. Super-mathematicians would just have a single list, Definition 1, Definition 2, …. Definition 304726, … and likewise for all mathematical theorems.)

I wouldn’t write “N ⊢ φ”. N is a model. Models satisfy statements, theories prove them. So we can write “PA ⊢ φ”. Now, if you want to say, “In the model N, there is a proof, perhaps of nonstandard length, of φ from the axioms of PA”, you could write “N ⊧ (PA ⊢ φ)”. Here “(PA ⊢ φ)” is what I call ‘vernacular’: not actually an expression in the formal language, but intended as a shorthand for a formal expression. Or I might put PA ⊢ φ in quotes, to indicate it’s vernacular.

BS: Ok, I will try to follow those rules—that does clarify things.

By the way, I just noticed that you wrote N⊧β(b) even though b might be nonstandard. Does that mean we are talking about N satisfying a nonstandard formula?

MW: No—in that formula, b is a name, so β(b) is a standard formula even if the model element named by b is nonstandard.

BS: What are “names”?

MW: New constants. Suppose we have a structure A for a language L. So L has relation symbols, maybe also function symbols and constants. A has corresponding relations, functions, and elements. We don’t necessarily have a constant for every element of A. A standard trick in model theory is to expand L by adding a new constant for every element of A. Many people (including me) use LA to denote the expanded language, and call the new constants names. (I say a little bit more about them in this Topics post.)

BS: Thanks—these details do help!

Back to the PHT—my biggest question was about how we “really” prove M is a model of PA, since the stated method in Post 9 sounded like it was trying to do more than ought to be needed—it seemed to want M to be “similar to N” in some sense.

MW: Good place to start. Are you familiar with the concepts of elementary equivalence and elementary submodel?

Let me give a précis anyway. Suppose you have a first-order language L and two structures A and B for this language. A and B are elementarily equivalent if they satisfy exactly the same first-order sentences: A ⊧ φ iff B ⊧ φ for any closed formula φ of L. If A is a substructure of B, then A is an elementary substructure of B if for any first order formula $\varphi(\vec{x})$ in L and any $\vec{a}$ in A, then $A\models\varphi(\vec{a})$ iff $B\models\varphi(\vec{a})$. (If A and B are both models of a theory T in the language L, then we say ‘elementary submodel’ instead.) By the way, I use the notation $\vec{a}$ as shorthand for a1,…,an. The ai’s are names.

‘Elementary substructure’ is stronger than ‘elementary equivalence’, since it allows for names of elements of A in the formulas.

BS: So do I understand correctly that ‘elementary’ basically means ‘every formula is absolute’ (when comparing the two structures involved)?

MW: Yes, that’s right. Absolute between the two structures. For elementary equivalence, this absoluteness holds for all closed formulas of L. For elementary substructure, for all closed formulas of the expanded language LA.

BS: It feels like this post is all “preliminary” so far—though it is certainly a necessary and helpful discussion (for me anyway).

MW: Well, I think we’ve gotten a little ways in. We have the setup: N is a nonstandard model of PA, B is a subset of “diagonal indiscernibles” of N (whatever those are), M is the downward closure of B. We know that N satisfies the PHP, by hypothesis. We know the goal of the proof: to show that M is a model of PA, and that M does not satisfy the PHP.

I brought up elementary equivalence for the following reason. If we could show that M was elementarily equivalent to N, then it would follow immediately that M was a model of PA. That’s how I thought it went for a moment, the first time I read the proof. But that can’t be right, because PHP holds in N and not in M!

I think you were getting at this when you wrote, “it seemed to want M to be ‘similar to N’ in some sense”. That is an aspect of the proof: the authors demonstrate a “transfer principle” that transfers certain statements back and forth between M and N. This transfer principle is then used to show that M is a model of PA. But the principle is more subtle than elementary equivalence, or elementary substructure.

BS: Ok—I am eager to see where all this goes, in the next post!

Prev TOC Next

Filed under Conversations, Peano arithmetic

Topics in Nonstandard Arithmetic 5: Truth (Part 2)

Prev TOC Next

Last time we looked at Tarski’s inductive definition of truth, expressed informally. We saw how for models of PA, it can be formalized as an infinite sequence of formulas True0, True1, …, formulas belonging to L(PA) itself. But not as a single formula in L(PA).

“Expressed informally” means, “in naive set theory”. ZF wouldn’t be worth its salt if it couldn’t handle Tarski’s definition. But there’s a catch. We know there has to be a catch, because one of the incarnations of the Undefinability Theorem says that no single formula in L(ZF) can define truth for all sentences of L(ZF). That is, for the “real universe” V.

Let L be a first-order language. L has a set of relation symbols, a set of function symbols, and a set of constants. So L can be packaged up as a set, once we establish some coding conventions; I won’t bother to spell them out. PA can handle syntax via Gödel numbering; it’s even easier in ZF. Syntactic elements, after all, are combinatorial objects, and ZF can deal with that! For example, I’m going to regard formulas in L as parse trees.

Let S be a set-based structure for a language L. This means that the domain of S (dom(S)) is a set and not a proper class. Therefore the relations and functions on the domain must also be sets. In fact, the whole structure can be packaged up as a set—the set S. Again I won’t spell out conventions, but they’re not hard to cook up. S should be a tuple, say (dom, rel, const, …); there should be functions telling us which relations of S interpret which relation symbols of L, etc.

Formalizing the Tarski definition for S as a structure for L, one obtains a formula True in L(ZF), with three free variables:

True(L,S,φ) means
S
is a set-based structure for L and φ is a sentence of L and S ⊧ φ.

There’s also a parametrized version, True(L,S,φ,u), where u is a finite list of elements of dom(S) to be assigned to the free variables in φ.

Let me flesh out the ideas, without going full formal. We start with the representation of formulas as parse trees. For example ∃x(φ(x)∧ψ(x)) has a node labeled ∃ with a child labeled ∧, which in turn has two subtrees for φ(x) and ψ(x). For clarity, I’ll label nodes with complete subformulas: ∃x(φ(x)∧ψ(x)) and φ(x)∧ψ(x) instead of just ∃ and ∧, for example.

Even if the tree represents a closed formula, it may contain subtrees whose formulas are not closed. I’m going to associate an instantiation tree with the parse tree. The nodes of the instantiation tree contain sets of closed formulas, in the expanded language where we’ve added a name for every element of the domain of S. Here’s the idea: let’s say a subtree of the parse tree represents a subformula ψ(x1,…,xn), with free variables as shown. We populate the corresponding node of the instantiation tree with all possible closed formulas ψ(c1,…,cn), where the c‘s are names of elements of dom(S).

Here’s a picture for ∃x(φ(x)∧ψ(x)).

(Since φ(x) and ψ(x) represent subformulas with their own internal structure, there are more nodes below the ones displayed. Also, I’ve picked three names a,b,c to suggest the whole crew of all possible names. Add “…” if you like.)

The truth evaluation of an instantiation tree assigns a truth-value to every formula living in the tree. The rules are just what you’d expect. For example, φ(a)∧ψ(a) is assigned ‘true’ iff φ(a) and ψ(a) are both assigned ‘true’. And a node ∃xθ(x) is assigned ‘true’ iff some formula θ(c) of the child node is assigned ‘true’.

ZF is powerful enough to:

• define the notion of an instantiation tree;
• show that there is exactly one instantiation tree for every parse tree;
• define the notion of a truth evaluation function eval for an instantiation tree;
• prove that there is exactly one such eval for any instantiation tree.

Informally you do a “bottom up” evaluation, just the way compilers evaluate expressions. When you formalize all this in ZF, you’ll end up something like this, for the definition of eval:

eval is a function from the closed formulas in the nodes of the instantiation tree X to {true, false}, and
(∀ nodes p in X) (∀ formulas φ in p) […]

There’s a lot of hair in those three dots, […]. Stuff saying how eval(φ∧ψ) has the right value, likewise eval(∃xθ(x)). We do a scan over the tree, checking things locally (i.e., for a node and its immediate children). Here is the pivotal point: we’ll need one more quantifier, to handle the case of eval(∃xθ(x)). We won’t need a quantifier for each quantifier appearing in the tree—the tree could have a thousand ∃’s, but our definition of eval will still need only those three ∀’s I mentioned. Syntactic quantifiers that are “buried in the parse tree” don’t “bleed through” to the ZF formula True(L,S,φ).

You can see why this definition requires the structure S to be set-based. If dom(S) wasn’t a set, but a proper class (like the “class of all sets” V), then the nodes of the instantiation tree would have to contain proper classes of formulas. But ZF forbids proper classes, at least formally. Even Gödel-Bernays set theory (GB) is very sniffy about them: a proper class can’t be an element of anything else, so it certainly can’t live inside a node of a tree. (Informally, people often refer to classes when discussing ZF. But this is just a way of referring to formulas in L(ZF) with a free variable.)

The upshot: True(L,S,φ) does not apply to the “real universe” (V,∈) as a structure for L(ZF). (V,∈) may be a ‘structure’ (depending on your definitions), but it’s not a set-based structure. Thus we avoid a confrontation with the Undefinability Theorem.

Last time we mentioned a nice property of Trued:

For all φ of depth at most d,      N ⊧ φ  iff  N ⊧ Trued(⌜φ⌝).

We can prove something analogous, informally. For any sentence φ of L,

S ⊧ φ  iff  True(L,S,φ).

But we can’t do this formally in ZF, because True(L,S,φ) is our formalization of “S ⊧ φ”! (However, ZF can prove the equivalence of “N ⊧ φ” and “N ⊧ Trued(⌜φ⌝)” for shallow enough φ’s, now that we know how to formalize this in ZF.)

For all φ of depth at most d,      PA ⊢ φ↔Trued(⌜φ⌝).

Again we have nothing really analogous. But the notions of ‘theory’, ‘proof’, and ‘consistency’ can be formalized in ZF. Then we have for any theory T in a language L

ZF ⊢  (ST) ∧ (T ⊢ φ) →(S ⊧ φ)

The right hand side is vernacular: we formalize “ST” and “S ⊧ φ” using the formula True, and “T ⊢ φ” using the ZF formalization of ‘proof’. Using this, ZF can formally prove that if a theory has a model, it is consistent. As for the converse, the usual proofs of the Completeness Theorem can be done inside ZFC.

Closing remark: I’ve written φ rather than ⌜φ⌝, because I said that formulas are parse trees. But it’s sometimes useful to think of formulas more informally, and to say that the parse tree, as a set, is the “Gödel set” of φ, and denote it by ⌜φ⌝.

Prev TOC Next

Filed under Peano arithmetic

Topics in Nonstandard Arithmetic 4: Truth (Part 1)

In post 15 of the Conversation, I observed:

• Gödel’s two most famous results are the completeness theorem and the incompleteness theorem.
• Tarski’s two most famous results are the undefinability of truth and the definition of truth.

The second bullet has occupied its share of pixels in the Conversation. Time for a summing up.

Filed under Peano arithmetic

Non-standard Models of Arithmetic 20

Trudy Campbell

MW: OK, let’s recap the setup: we have a three-decker ωUUV. So far as U is concerned, ωU is the “real, true omega”. V knows it isn’t. Enayat’s question: what properties must an omega have, for it to be the omega of a model of T? Here T is a recursively axiomatizable extension of ZF, and U is a model of it.

1 Comment

Filed under Conversations, Peano arithmetic

Topics in Nonstandard Arithmetic 3: The Arithmetic Hierarchy (Part 2)

Last time I defined ∃n and ∀n prefixes and formulas; Σn, Πn , and Δn relations (and functions) on ℕ; Σn(PA), Πn(PA), and Δn(PA) formulas in L(PA); and Σn(N), Πn(N), and Δn(N) relations (and functions) on a model N of PA. I won’t repeat all that, but a few bullet points may help load it into working memory: