Topics in Nonstandard Arithmetic 10: Truth (Part 4)

Prev TOC Next

Previous “Truth” post

This post is a bit long, so I’m splitting it into sections, with a mini-TOC:

  1. Intro
  2. The Completeness Theorem
  3. Inside PA
  4. Proof Sketch
  5. Satisfaction
  6. End Extensions


We’ve already looked at three ways to get around Tarski’s Undefinability Theorem:

  1. Trued(φ), a truth predicate in L(PA) for sentences (closed formulas) of L(PA) of parse-tree depth at most d.
  2. True(L,S,φ), a truth predicate in L(ZF) for sentences in a language L, with S being a set-based structure for L.
  3. SatΣn(φ), a truth predicate in L(PA) for Σn sentences of L(PA).

Strictly speaking I should have written ⌜φ⌝ instead of φ, where ⌜φ⌝ is the Gödel number for the closed formula φ (or Gödel set in case (2)). But we’re among friends, so I abused notation a little. In all three cases, we can modify the predicate to cover all formulas: we add an argument to code a list of parameters corresponding to the free variables in φ.

All three of these truth predicates rest on top of Tarski’s original definition, formalizing the details differently. They sneak around the barrier of Tarski’s Undefinability Theorem by being approximations. I think this is clear enough for (1) and (3). For (2), the Undefinability Theorem forbids a definition of truth for ZF formulas as a ZF formula. The universe of ZF, of course, is not a set: (V,∈) is not a set-based structure. But we can approximate (V,∈) with (Vα,∈), where Vα is one of the layers of the cumulative hierarchy (and hence a set). The Reflection Principle of ZF helps out, providing “good enough” approximations for many purposes.

The Arithmetized Completeness Theorem (ACT) takes a different tack. It’s based on proofs of the Completeness Theorem, which reverse the roles of structure and satisfaction as found in Tarski’s definition. With Tarski, first we have a structure; this furnishes the definition of truth for atomic sentences. Then we use an induction to extend the definition to all sentences. With the usual proofs of the Completeness Theorem, we start by deciding which sentences should hold, and only then do we use this to define the structure. (I’m oversimplifying a bit; see below.)

Side comment: this reminds me of a quote from Cohen’s historical paper “The Discovery of Forcing”:

…the [generic] set a will not be determined completely, yet properties of a will be completely determined on the basis of very incomplete information about a. I would like to pause and ask the reader to contemplate the seeming contradiction in the above. This idea as it presented itself to me, appeared so different from any normal way of thinking, that I felt it could have enormous consequences. On the other hand, it seemed to skirt the possibility of contradiction in a very perilous manner. Of course, a new generation has arisen who imbibe this idea with their first serious exposure to set theory, and for them, presumably, it does not have the mystical quality that it had for me when I first thought of it. How could one decide whether a statement about a is true, before we have a? In a somewhat exaggerated sense, it seemed that I would have to examine the very meaning of truth and think about it in a new way.

As many have noted, the definitions of forcing and of truth have more than a nodding acquaintance. End side comment.

The Completeness Theorem

The Completeness Theorem says that if T is a consistent theory in a first-order language L, then T has a model M. Some people call this the Model Existence Theorem, since Gödel’s original formulation differed. But it takes just a hop to go from one version to the other, so I won’t distinguish. The countable case will be all we’ll need, where L and M are both countable.

Time to sketch the proof due to Henkin, the one you’ll find in most textbooks. (Below, we’ll see how this can be carried out inside PA for a theory that is representable in L(PA).)

Let T be a consistent theory in a countable language L. We have to construct a (countable) model M of T. The proof breaks down into four steps. I’ll first say a bit about each step, reserving some details for later.

(a) We add a bunch a constants to L (called witnesses), thus expanding L to a language L°. The universe of M will ultimately emerge from these witnesses.

Each witness c comes with a so-called witnessing axiom (also called a Henkin axiom), of the form


T plus all the witnessing axioms forms a consistent theory T° in the language L°.

(b) Next we extend T° to a complete consistent theory T*, also in the language L°.

(c) Next we build a model M out of the constants of L°, including all the witnesses (the so-called canonical model). We define an equivalence relation on the set of constants: cd iff T*c=d. Elements of M are equivalence classes of constants. The interpretation of a constant c is, of course, its class [c]. For any predicate letter R of L, define the relation RM on M thus: RM([c],[d]) holds iff T*R(c,d). For a function symbol f, define fM([c])=[d], where T*f(c)=d. (Likewise for relations and functions of any arity, of course.)

(d) Finally we have to show that M is a model of T*.

The details of step (d) motivate the details of step (a), so we’ll begin with those. First note that a sentence φ belongs to T* iff T*⊢φ, because T* is consistent and complete. We show that φ belongs to T* iff M⊧φ by induction on formula complexity (as usual). The logical connectives offer no resistence: T*⊢φ∧ψ ⇔ (T*⊢φ and T*⊢ψ) ⇔ (M⊧φ and M⊧ψ) (by induction) ⇔ M⊧φ∧ψ. Likewise for ¬: T*⊢¬φ ⇔ not T*⊢φ (because T* is consistent and complete) ⇔ not M⊧φ ⇔ M⊧¬φ.

Existential quantifiers next. In one direction, no sweat: M⊧∃xφ(x) ⇒ M⊧φ(c) for some [c] in M ⇒ T*⊢φ(c) ⇒ T*⊢∃xφ(x). But for the reverse, we need T*⊢∃xφ(x) to imply T*⊢φ(c) for some specific c. That’s why we have witnesses! With a witness c and a witness axiom ∃xφ(x)→φ(c) belonging to T° (a fortiori to T*), we’re home free.

So let’s turn to step (a). Because L is countable, we can enumerate all sentences of the form ∃xφ(x), and toss in a witness with witness axiom for it. Let’s call the result T1 in the language L1. Since L1 boasts a crop of new constants, it also has a bunch of new existential sentences. Add witnesses (with axioms) for these and call the result T2 in L2. Inductively define Ln and Tn for every n∈ℕ. Let L° be the union of all the Ln‘s, likewise T°.

We need to show that T° is consistent. Clearly it’s enough to show that Tn+1 is consistent if Tn is—indeed, that adding any one witness, with axiom, cannot introduce an inconsistency. Intuitively, if you know that ∃xφ(x), then you should be able to say, “OK, let c be an element such that φ(c).” (Because c is a brand-new constant, it has no “baggage”.) Every version of first-order logic offers some mechanism to formalize this sort of argument, common in non-formal mathematics, and which gets at the heart of existential quantification. I know this is rather vague; filling in the details would take many paragraphs, so I’ll leave that to the textbooks.  The upshot is that T° is consistent.

Step (b): Enumerate all the sentences φ1, φ2,… of . Let T(0) = T°. If T(0)+{φ1} is consistent, let T(1) be that; otherwise, let T(1) be T(0)+{¬φ1}. (One of these has to be consistent.) Let’s say ψ1 is φ1 in the first case, ¬φ1 in the second. Inductively, if T(n) is +{ψ1,…,ψn}, let T(n+1) be T(n)+{φn+1} if that’s consistent, T(n)+{¬φn+1} otherwise, and let ψn+1 be φn+1 respectively ¬φn+1. It’s not hard to see that T* = +{ψ12,…,} is consistent and complete.

Only one remark about (c): the language L° has a name for every element of the model, since it’s built out of the constants (including witnesses).

Inside PA

We can state the ACT in “slogan” form (for a model N of PA) as

If N⊧Con(T), then there is an M with MT

or even stronger,

PA ⊢ Con(T) → MT

These require some unpacking. Both T and M are shorthand for bunches of formulas of L(PA), representing respectively the first-order theory T (in some language L) and the model M built from it. As usual, we code formulas, proofs, etc. (all of syntax) as Gödel numbers. For example, we’ll have a formula  Sentence(⌜φ⌝) expressing the fact that ⌜φ⌝ is the Gödel number of a sentence (closed formula) φ.

It makes life more pleasant if we elide the distinction between a formula and its Gödel number, and write just φ instead of ⌜φ⌝. For example, we’ll have a function (φ,ψ)↦φ∧ψ, more strictly (⌜φ⌝, ⌜ψ⌝)↦⌜φ∧ψ⌝. (Imagine that formulas are integers.)

For the ACT, the theory T must be representable aka arithmetic: we must have a formula with one free variable, θ(x), such that φ belongs to T iff ℕ⊧θ(φ). Con(T), finally, will be a sentence in L(PA) expressing the consistency of T.

As for M, Henkin’s proof builds M from the “raw material” of L. When L is countable, M will be too. So we can take the domain of M to be a subset of ℕ. The relations of M will be relations on ℕ, etc. All this semantics will be definable via formulas of L(PA) (i.e., arithmetic) when T is. For example, we’ll have a formula δ(x) that holds when x belongs to the domain, and a formula ρ(R,(s1sk)) that holds when MR(s1sk). (Here we appeal to our convention that inside ρ, R and (s1sk) really mean their Gödel numbers.)

In particular, we will have a formula SatM(φ) representing “M ⊧φ”. A peculiarity of the construction: one defines SatM directly, not in terms of ρ(R,(s1sk)). Indeed, ρ is then defined just as a special case of SatM.

Now what if N is a nonstandard model of PA? Then all our formulas can be interpreted in N. Thus we have “nonstandard formulas and proofs”. It may help for the moment to surface the Gödel coding. If N⊧Sentence(⌜φ⌝) and  ⌜φ⌝ is a nonstandard element of N, then φ is a “nonstandard sentence”. In the most interesting cases, nonstandard formulas will have nonstandard lengths.

Likewise, if N⊧SatM(⌜φ⌝ ) then M ⊧φ. We have a definition of “truth” even for nonstandard sentences! We will return to this below.

Proof Sketch

With this mindset, let’s look at step (a) in the proof of the Completeness Theorem. With a grand handwave, I assert that this step is “nothing but syntax” and so can be formalized in PA. If you want to be meticulous, there’s a lot to check. First off, you should code your language L so that the Gödel numbers of formulas don’t use up all of ℕ, so as to leave room for L1, L2,… and ultimately all of  . The inductive step from Tn to Tn+1 is messy but clearly recursive in some sense. Standard mechanisms in PA handle inductive definitions for things like  and . Ultimately we end up with a formula θ° representing T°.

Things become interesting with step (b). Recall that φ123… is an enumeration of all the sentences of . The basic concept here is the full binary tree representing the combinations {φ1}, {¬φ1}, {φ12}, {φ1,¬φ2}, {¬φ12}, {¬φ1,¬φ2}, etc. Like so:

Binary Tree of Formulas


Each node is addressed with a bitstring: the root with the empty string, {φ1} and {¬φ1} with 0 and 1, the next four nodes with 00,01,10,11, etc. For now let’s concentrate on the tree aspects, putting aside the φ’s. The bitstrings can be enumerated recursively. Notions like “node x is at level n”, “y is the parent of x”, “y is a descendent of x”, are all expressible in PA. In what follows, I’ll make various combinatorial assertions; these can be proved in PA with rather routine inductions. (Note that with our conventions, descendents appear higher in the tree.)

A tree is given by a formula (say τ(x)) with the property that if τ(x) holds, then τ(y) holds for any ancestor y of x. Equivalently: for all x, if τ(x) holds then τ(y) holds for the parent y of x. We say a tree is infinite if it contains nodes at every level. Equivalently, it is infinite if it contains nodes at arbitrarily high levels: to find a node at level m, just find a node at some level n>m, and then its ancestor at level m will belong to the tree.

A path π in a tree τ is a subtree of τ such that if x and y belong to the path with x no higher than y, then x=y or x is an ancestor of y. Equivalently, π is a subtree of τ with at most one node at every level. The second definition follows from the first, because if x and y are at the same level then we must have x=y. Conversely, assume at most one node at each level. if x and y are at the same level then x=y; if y is higher than x, then y must have an ancestor x′ at x’s level, which must then be x.

If τ is an infinite tree, then it contains an infinite path. This is the well-known Kőnig Infinity Lemma. It is key for us that we can give a formula π(x), piggybacking on the formula τ(x), and PA can prove that π is an infinite path in τ. First let τ′(x) be the formula for “there are descendents of x in τ at every level above x”. Equivalently, “there are descendents of x in τ at arbitrarily high levels above x”:

n>level(x)[∃y(level(y)≥n ∧ y is a descendent of x]

These are equivalent for the same reasons that our two definitions of infinite tree are equivalent. In particular, τ is infinite iff the root belongs to τ′. Clearly τ′ is a tree (i.e., contains the parent of any of its nodes). Every node in τ′ has at least one child in τ′: for if the two children x0 and x1 of x have no descendents above (say) levels n1 and n2 respectively, then x has no descendents above level max(n1,n2). So if τ is infinite then τ′ is infinite.

Now let π(x) be “x is the leftmost node in τ′ at x’s level”. If π(x), then the parent y of x also belongs to π: we know τ′(y) holds, and if any node in τ′ at y’s level is to the left of y, then it would have a child in τ′ to the left of x. But that can’t be because x is in π. So π is “closed under parenthood”, thus is a tree. Obviously π has exactly one node at every level (since τ′ has at least one). So π is a path in τ.

Summarizing, we have a PA version of Kőnig’s Infinity Lemma. We’ve seen how to derive the formula π(x) from the formula τ(x), and prove (in PA) that if τ(x) defines an infinite tree, then π(x) defines an infinite path in that tree.

To finish step (b), we attach a finite set of sentences (say Φx) to each node in the full binary tree. (E.g., Φ011={φ1,¬φ2,¬φ3)}.) Let τ(x) be “x is consistent”. The root belongs to τ since T° is consistent. From basic logic, if a theory S is consistent, then either S+{φ} or S+{¬φ} must also be consistent, for any sentence φ. So each node in τ has at least one child, and τ(x) defines an infinite tree. Therefore it contains an infinite path π, defining our theory T*. T* is complete because π has a node at each level, and therefore either φnT* or ¬φnT* for every n. T* is consistent because a proof of a contradiction uses only a finite number of formulas; this would mean that x was already inconsistent for some x in τ.

Not a lot to say about step (c). Since L° is coded in PA, the constants are represented by numbers; we can choose the smallest number in the equivalence class of [c] to represent the class. So the domain of our model is a definable set of numbers, and the interpretations of the constants, relation symbols, and function symbols runs pretty much as described earlier. The model MT  is thus specified by a bunch of formulas in L(PA).

Finally step (d). Writing out the details formally would be unbearably tedious. But let’s keep sight of the big picture. Our definition of satisfaction for the model MT  is this: MT ⊧φ iff φ∈T*. It’s not Tarski’s inductive definition. But we can still carry out the arguments for step (d) from the previous section, and these show that our definition obeys all the clauses of the induction: φ∧ψ∈T* iff φ∈T* and ψ∈T*, ¬φ∈T* iff not φ∈T*, and ∃xφ(x)∈T* iff φ(a)∈T* for some constant a in L°.

Three more remarks. First: in the previous section, we assumed that the language L was countable. But once we’ve formalized the argument in PA, the result applies just as much to uncountable models of PA. You might say that an uncountable model N still “thinks that L is countable”.

Second: I haven’t bothered  to keep track of complexity (in the Δn sense). But if one does care about that, it’s good to make a tweak to the definition of τ(x) above. Instead of using “x is consistent”, replace that with “x has no proof of a contradiction with Gödel number less than n, where n=length(x)”. This definition is recursive in T, still works for the proof, and does not introduce an extra universal quantifier.

Third: we can generalize this entire discussion. Replace the formula θ(x) with a formula θ(x,y) in L(PA). Then pick an element aN and let θ(x,a) define the theory T. We will say the T is definable in N with a parameter. Everything in the section still works, resulting in a model and a truth predicate for it. Of course, the formulas defining the model, and the truth predicate, involve the parameter a.


Say θ(t) in L(PA) represents a theory T . The Arithmetized Completeness Theorem (ACT) tells us how to go from θ(t) to bunch of formulas in L(PA) intended to represent a model M. It says that PA proves the implication “Con(T)→M is a model of T”. Of course “Con(T)” and “M is a model of T” are written formally in terms of θ(t) and the derived formulas.

Briefly, consistent arithmetic theories have arithmetic models.

The key derived formula, call it SatM, represents “M⊧φ”. We noted above that SatM obeys all the inductive clauses of Tarski’s definition, although SatM is not defined that way.

What does this mean for a nonstandard N? As noted above, in that case we will have sentences of nonstandard length. And the formula SatM defines satisfaction in M, even for those formulas!

Tarski’s Undefinability Theorem says that we can’t parley his inductive definition of satisfaction into a formula. But reasoning outside of N, in the “real world”, we see that for standard formulas, SatM must agree the usual notion of truth—because SatM obeys all the inductive clauses.

What happens if we collide the Tarski’s Undefinability Theorem with the ACT? Suppose ℕ⊧Con(PA)—if it doesn’t, we might as well quit this blog and g0 watch movies. ACT gives us a model M⊧PA with an arithmetic formula for satisfaction. If ℕ were elementarily equivalent to M (over L(PA)), then “truth in ℕ” would agree with “truth in M”. In other words, if φ is a sentence in L(PA), and if ℕ≡M, then we’d have ℕ⊧φ ⇔ SatM(φ). But that would mean we had an arithmetic formula defining truth in ℕ, which can’t be by Tarski’s Theorem. So the model given by the ACT is not elementarily equivalent to ℕ.

What if we let T=Th(ℕ), so-called true arithmetic? Assuming ℕ⊧Con(T), wouldn’t the ACT give us a model MT with ℕ≡M? Ah, but Th(ℕ) is not arithmetic, so the ACT doesn’t apply.

We observed in post 4 that Tarski’s Theorem holds for any N⊧PA. So if N⊧Con(PA), then the ACT gives birth to a model of PA that is not elementarily equivalent to N (over L(PA)).

End Extensions

The ACT’s special skill is constructing end extensions of models of PA. Say T is an arithmetic theory extending PA (or just PA). Here’s the induction that imbeds ℕ as an initial segment of a model N of PA:

0 ↦ oM
n+1 ↦ nM  +M  1M

This can be formalized in PA. So if M is a model of PA, and if M⊧Con(T), then there is an end extension NeM with NT. Moreover, N is a model of T in the “real world”. Since N is not elementarily equivalent to M, N is a proper extension: NeM.

As a slight variation, we have a theorem of Manevitz:

If N⊧PA+Con(ZF), then there is a model of ZF whose ω is an end extension of N.

Same idea, just formalize the induction imbedding ℕ as an initial segment of the ω of a model of ZF.

ZF, as we know, can prove results about integers beyond the powers of PA.  A corollary to Manevitz’s theorem gives more information.

Recall from Topics 8 that Δ0 sentences are absolute for end-extensions: if φ is Δ0 and MeN, then N⊧φ iff M⊧φ. Thus Π1 sentences are downwards persistent for end-extensions: if φ is Π1 and MeN and N⊧φ, then M⊧φ. Now suppose that ZF⊢φ for a Π1 sentence φ. We will show that φ holds in any model M of PA+Con(ZF). By Manevitz’s theorem we know M has an end-extension N that is the ω of a model of ZF. Since ZF⊢φ, φ holds in N and thus by downward persistence also holds in M. We conclude:

If ZF⊢φ for a Π1 sentence φ, then PA+Con(ZF)⊢φ.

A theorem of McAloon uses the ACT plus some more twists:

If a nonstandard model M of PA has an extension K satisfying φ(c), where φ(x)∈L(PA) and cM, then M has an end-extension N satisfying φ(c). (Both K and N are assumed to be models of PA.)

At first this may seem rather trivial: K⊧PA+φ(c) tells us that PA+φ(c) is consistent, so by the ACT it has a model N, which must be an end-extension of M (as we’ve seen). Not so fast! The hypothesis of the ACT is M⊧Con(PA+φ(c)), and this can involve nonstandard formulas and proofs. PA+¬Con(PA) is famously consistent, but “nonstandardly inconsistent”—a model of it will contain a nonstandard proof of a contradiction. So we have to step carefully.

Let’s write PAn for the first n axioms of PA, in some recursive enumeration. Consider

ψ(n) ≡ Con(PAn+φ(c))

If we knew that M⊧ψ(n) for all standard n, we could use overspill to get M⊧ψ(r) for some nonstandard r. Then we could apply the ACT to the theory PAr+φ(c), resulting in an end-extension NeM with N a model of PAr+φ(c). Since PAr includes PA, we’d be done.

K⊧PA+φ(c) tells us PAn+φ(c) is consistent for all standard n. In other words, ℕ⊧Con(PAn+φ(c)). Now we pull a rabbit out of a hat: it follows from a theorem in proof theory that PA⊢Con(PAn+φ(c))! So M⊧ψ(n).

The proof of our “rabbit” involves our old friend SatΣd(φ), a truth predicate in L(PA) for Σd sentences. Make a big conjunction out of PAn+φ(c), but still write PAn+φ(c) to stand for this single sentence. Because ℕ⊧PAn+φ(c), we have ℕ⊧SatΣd(PAn+φ(c)) for some large enough d. At this point you might think we can declare victory: PAn+φ(c) is true in ℕ, 0=1 is false, and proofs preserve truth. Crucially, we have a notion of “truth” formalized in PA. That is, after all, how the proof of Con(PA) in ZF goes.

But there’s a hitch. The proof of 0=1 from PAn+φ(c) might pass through other sentences not belonging to Σd. However, a result in proof theory says that any such proof can be converted into one where all the intermediate sentences do belong to Σd. I confess I know almost nothing about the details, but the buzz phrases “cut-free proof” and “Gentzen’s consistency proof”. At any rate, all this reasoning can be carried out inside PA, and so, for the particular sentence Con(PAn+φ(c)), we can pass from satisfaction in ℕ to provability in PA. Recapping, that means M⊧Con(PAn+φ(c)) for every standard n, hence also for a nonstandard r, and hence we can apply the ACT to PAr+φ(c), giving the desired end-extension.

Previous “Truth” post

Prev TOC Next

Leave a comment

Filed under Peano Arithmetic

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.