This post is a bit long, so I’m splitting it into sections, with a mini-TOC:

We’ve already looked at three ways to get around Tarski’s Undefinability Theorem:

- True
(φ), a truth predicate in L(PA) for sentences (closed formulas) of L(PA) of parse-tree depth at most_{d}*d*. - True(
*L*,*S*,φ), a truth predicate in L(ZF) for sentences in a language*L*, with*S*being a set-based structure for*L*. - Sat
_{Σn}(φ), a truth predicate in L(PA) for Σsentences of L(PA)._{n}

Strictly speaking I should have written ⌜φ⌝ instead of φ, where ⌜φ⌝ is the Gödel number for the closd formula φ (or Gödel set in case (2)). But we’re among friends, so I abused notation a little. In all three cases, we can modify the predicate to cover *all* formulas: we add an argument to code a list of parameters corresponding to the free variables in φ.

All three of these truth predicates rest on top of Tarski’s original definition, formalizing the details differently. They sneak around the barrier of Tarski’s Undefinability Theorem by being *approximations*. I think this is clear enough for (1) and (3). For (2), the Undefinability Theorem forbids a definition of truth *for* ZF formulas *as* a ZF formula. The universe of ZF, of course, is not a set: (*V*,∈) is not a set-based structure. But we can approximate (*V*,∈) with (*V _{α}*,∈), where

*V*is one of the layers of the cumulative hierarchy (and hence a set). The Reflection Principle of ZF helps out, providing “good enough” approximations for many purposes.

_{α}The Arithmetized Completeness Theorem (ACT) takes a different tack. It’s based on proofs of the Completeness Theorem, which reverse the roles of structure and satisfaction as found in Tarski’s definition. With Tarski, *first* we have a structure; this furnishes the definition of truth for atomic sentences. *Then *we use an induction to extend the definition to all sentences. With the usual proofs of the Completeness Theorem, we start by deciding which sentences *should* hold, and only *then* do we use this to define the structure. (I’m oversimplifying a bit; see below.)

Side comment: this reminds me of a quote from Cohen’s historical paper “The Discovery of Forcing”:

…the [generic] set

awill not be determined completely, yet properties ofawill be completely determined on the basis of very incomplete information abouta. I would like to pause and ask the reader to contemplate the seeming contradiction in the above. This idea as it presented itself to me, appeared so different from any normal way of thinking, that I felt it could have enormous consequences. On the other hand, it seemed to skirt the possibility of contradiction in a very perilous manner. Of course, a new generation has arisen who imbibe this idea with their first serious exposure to set theory, and for them, presumably, it does not have the mystical quality that it had for me when I first thought of it. How could one decide whether a statement aboutais true, before we havea? In a somewhat exaggerated sense, it seemed that I would have to examine the very meaning of truth and think about it in a new way.

As many have noted, the definitions of forcing and of truth have more than a nodding acquaintance. End side comment.

Time to sketch the proof due to Henkin, the one you’ll find in most textbooks. (In the next section, we’ll see how this can be carried out inside PA for a theory that is representable in L(PA).) The countable case will be all we’ll need.

Let *T* be a consistent theory in a countable language *L*. We have to construct a (countable) model *M* of *T*. The proof breaks down into four steps. I’ll first say a bit about each step, reserving some details for later.

(a) We add a bunch a constants to *L* (called witnesses), thus expanding *L* to a language *L*°. The universe of *M* will ultimately emerge from these witnesses.

Each witness *c* comes with a so-called witnessing axiom (also called a Henkin axiom), of the form

∃*x*φ(*x*)→φ(*c*)

*T* plus all the witnessing axioms forms a consistent theory *T*° in the language *L*°.

(b) Next we extend *T*° to a complete consistent theory *T**, also in the language *L*°.

(c) Next we build a model *M* out of the constants of *L*°, including all the witnesses (the so-called canonical model). We define an equivalence relation on the set of constants: *c*∼*d* iff *T**⊢*c*=*d*. Elements of *M* are equivalence classes of constants. The interpretation of a constant *c* is, of course, its class [*c*]. For any predicate letter *R* of *L*, define the relation *R ^{M}* on

*M*thus:

*R*([

^{M}*c*],[

*d*]) holds iff

*T**⊢

*R*(

*c*,

*d*). For a function symbol

*f*, define

*f*([

^{M}*c*])=[

*d*], where

*T**⊢

*f*(

*c*)=

*d*. (Likewise for relations and functions of any arity, of course.)

(d) Finally we have to show that *M* is a model of *T*.*

The details of step (d) motivate the details of step (a), so we’ll begin with those. First note that a sentence φ belongs to *T** iff *T**⊢φ, because *T** is consistent and complete. We show that φ belongs to *T** iff *M*⊧φ by induction on formula complexity (as usual). The logical connectives offer no resistence: *T**⊢φ∧ψ ⇔ (*T**⊢φ and *T**⊢ψ) ⇔ (*M*⊧φ and *M*⊧ψ) (by induction) ⇔ *M*⊧φ∧ψ. Likewise for ¬: *T**⊢¬φ ⇔ not *T**⊢φ (because *T** is consistent and complete) ⇔ not *M*⊧φ ⇔ *M*⊧¬φ.

Existential quantifiers next. In one direction, no sweat: *M*⊧∃*x*φ(*x*) ⇒ *M*⊧φ(*c*) for some [*c*] in *M* ⇒ *T**⊢φ(*c*) ⇒ *T**⊢∃*x*φ(*x*). But for the reverse, we need *T**⊢∃*x*φ(*x*) to imply *T**⊢φ(*c*) for some specific *c*. That’s why we have witnesses! With a witness *c* and a witness axiom ∃*x*φ(*x*)→φ(*c*) belonging to *T*° (a fortiori to *T**), we’re home free.

So let’s turn to step (a). Because *L* is countable, we can enumerate all sentences of the form ∃*x*φ(*x*), and toss in a witness with witness axiom for it. Let’s call the result *T*_{1} in the language *L*_{1}. Since *L*_{1} boasts a crop of new constants, it also has a bunch of new existential sentences. Add witnesses (with axioms) for these and call the result *T*_{2} in *L*_{2}. Inductively define *L _{n}* and

*T*for every

_{n}*n*∈ℕ. Let

*L*° be the union of all the

*L*‘s, likewise

_{n}*T*°.

We need to show that *T*° is consistent. Clearly it’s enough to show that *T _{n}*

_{+1}is consistent if

*T*is—indeed, that adding any one witness, with axiom, cannot introduce an inconsistency. Intuitively, if you know that ∃

_{n}*x*φ(

*x*), then you

*should*be able to say, “OK, let

*c*be an element such that φ(

*c*).” (Because

*c*is a brand-new constant, it has no “baggage”.) Every version of first-order logic offers some mechanism to formalize this sort of argument, common in non-formal mathematics, and which gets at the heart of existential quantification. I know this is rather vague; filling in the details would take many paragraphs, so I’ll leave that to the textbooks. The upshot is that

*T*° is consistent.

Step (b): Enumerate all the sentences φ_{1}, φ_{2},… of *L°*. Let *T*_{(0)} = *T*°. If *T*_{(0)}+{φ_{1}} is consistent, let *T*_{(1)} be that; otherwise, let *T*_{(1)} be *T*_{(0)}+{¬φ_{1}}. (*One* of these has to be consistent.) Let’s say ψ_{1} is φ_{1} in the first case, ¬φ_{1} in the second. Inductively, if *T*_{(n)} is *T°*+{ψ_{1},…,ψ* _{n}*}, let

*T*

_{(n+1)}be

*T*

_{(n)}+{φ

_{n}_{+1}} if that’s consistent,

*T*

_{(n)}+{¬φ

_{n}_{+1}} otherwise, and let ψ

_{n}_{+1}be φ

_{n}_{+1}respectively ¬φ

_{n}_{+1}. It’s not hard to see that

*T**=

*T°*+{ψ

_{1},ψ

_{2,}…,} is consistent and complete.

Only one remark about (c): the language *L*° has a name for every element of the model, since it’s built out of the constants (including witnesses).

Let *T* be a consistent theory in a countable language *L*. As usual, we code formulas, proofs, etc. as Gödel numbers. *T* is *representable* in ℕ if there is a formula θ(*x*) in L(PA) with one free variable, such that φ belongs to *T* iff ℕ⊧θ(⌜φ⌝).

As I mentioned at the top of this post, it makes life more pleasant if we elide the distinction between a formula and its Gödel number, and write just θ(φ). Imagine that formula *are* integers. All the paraphenalia of syntax is nice and recursive: the functions (φ,ψ)↦φ∧ψ, φ∧ψ↦φ, and φ∧ψ↦ψ; (*c*,φ(*x*))↦φ(*c*); whether a formula follows from other formulas by a given rule of inference; etc.

The notions “*p* is a proof in *T*”, “*T*⊢φ”, and “*T* is consistent”, are all expressible in terms of θ. In other words, momentarily treat *T* as a new predicate symbol added to L(PA). We can then write formulas for all these notions in the expanded language. Substituting in the formula θ for each occurrence of *T*, we have formulas in L(PA) for all these notions.

(In fact, if θ is Δ_{n}, then “*p* is a proof in *T*” is also Δ_{n}, and “*T*⊢φ” and “*T* is consistent” are Σ_{n}. That’s because “*p* is a proof in *T*” is recursive in *T* in a suitable sense. But we won’t need that level of detail.)

Now if *N* is a nonstandard model of PA, then all these formulas for syntactic concepts can be interpreted in *N*. Thus we have “nonstandard formulas and proofs”. It may help for the moment to surface the Gödel coding. If Sentence(⌜φ⌝) is the formula expressing the concept “φ is a sentence”, then if ⌜φ⌝ is a nonstandard element of *N*, φ is a nonstandard sentence. In the most interesting cases, nonstandard formulas will have nonstandard lengths.

With this mindset, let’s look at step (a) in the proof of the Completeness Theorem. With a grand handwave, I assert that this step is “nothing but syntax” and so can be formalized in PA. If you want to be meticulous, there’s a lot to check. First off, you should code your language *L* so that the Gödel numbers of formulas don’t use up all of ℕ, so as to leave room for *L*_{1}, *L*_{2},… and ultimately all of *L°*. The inductive step from *T _{n}* to

*T*

_{n+1}is messy but clearly recursive in some sense. Standard mechanisms in PA handle inductive definitions for things like

*L°*and

*T°*. Ultimately we end up with a formula θ° representing

*T*°.

Things become interesting with step (b). Recall that φ_{1},φ_{2},φ_{3}… is an enumeration of all the sentences of *L°*. The basic concept here is the full binary tree representing the combinations {φ_{1}}, {¬φ_{1}}, {φ_{1},φ_{2}}, {φ_{1},¬φ_{2}}, {¬φ_{1},φ_{2}}, {¬φ_{1},¬φ_{2}}, etc. Like so:

Each node is addressed with a bitstring: the root with the empty string, {φ_{1}} and {¬φ_{1}} with 0 and 1, the next four nodes with 00,01,10,11, etc. For now let’s concentrate on the tree aspects, putting aside the φ’s. The bitstrings can be enumerated recursively. Notions like “node *x* is at level *n*”, “*y* is the parent of *x*”, “*y* is a descendent of *x*”, are all expressible in PA. In what follows, I’ll make various combinatorial assertions; these can be proved in PA with rather routine inductions. (Note that with our conventions, descendents appear *higher* in the tree.)

A *tree* is given by a formula (say τ(*x*)) with the property that if τ(*x*) holds, then τ(*y*) holds for any ancestor *y* of *x*. Equivalently: for all *x*, if τ(*x*) holds then τ(*y*) holds for the parent *y* of *x*. We say a tree is *infinite* if it contains nodes at every level. Equivalently, it is infinite if it contains nodes at arbitrarily high levels: to find a node at level *m*, just find a node at some level *n*>*m*, and then its ancestor at level *m* will belong to the tree.

A *path* π in a tree τ is a subtree of τ such that if *x* and *y* belong to the path with *x* no higher than *y*, then *x*=*y* or *x* is an ancestor of *y*. Equivalently, π is a subtree of τ with at most one node at every level. The second definition follows from the first, because if *x* and *y* are at the same level then we must have *x*=*y*. Conversely, assume at most one node at each level. if *x* and *y* are at the same level then *x*=*y*; if *y* is higher than *x*, then *y* must have an ancestor *x*′ at *x’s* level, which must then be *x*.

If τ is an infinite tree, then it contains an infinite path. This is the well-known Kőnig Infinity Lemma. It is key for us that we can give a formula π(*x*), piggybacking on the formula τ(*x*), and PA can prove that π is an infinite path in τ. First let τ′(*x*) be the formula for “there are descendents of *x* in τ at every level above *x*”. Equivalently, “there are descendents of *x* in τ at arbitrarily high levels above *x*”:

∀*n*>level(*x*)[∃*y*(level(*y*)≥*n* ∧ *y* is a descendent of *x*]

These are equivalent for the same reasons that our two definitions of *infinite tree* are equivalent. In particular, τ is infinite iff the root belongs to τ′. Clearly τ′ is a tree (i.e., contains the parent of any of its nodes). Every node in τ′ has at least one child in τ′: for if the two children *x*0 and *x*1 of *x* have no descendents above (say) levels *n*_{1} and *n*_{2} respectively, then *x* has no descendents above level max(*n*_{1},*n*_{2}). So if τ is infinite then τ′ is infinite.

Now let π(*x*) be “*x* is the leftmost node in τ′ at *x’*s level”. If π(*x*), then the parent *y* of *x* also belongs to π: we know τ′(*y*) holds, and if any node in τ′ at *y’*s level is to the left of *y*, then it would have a child in τ′ to the left of *x*. But that can’t be because *x* is in π. So π is “closed under parenthood”, thus is a tree. Obviously π has exactly one node at every level (since τ′ has at least one). So π is a path in τ.

Summarizing, we have a PA version of Kőnig’s Infinity Lemma. We’ve seen how to derive the formula π(*x*) from the formula τ(*x*), and prove (in PA) that if τ(*x*) defines an infinite tree, then π(*x*) defines an infinite path in that tree.

To finish step (b), we attach a finite set of sentences (say Φ_{x}) to each node in the full binary tree. (E.g., Φ_{011}={φ_{1},¬φ_{2},¬φ_{3})}.) Let τ(*x*) be “*T°*+Φ_{x} is consistent”. The root belongs to τ since *T*° is consistent. From basic logic, if a theory *S* is consistent, then either *S*+{φ} or *S*+{¬φ} must also be consistent, for any sentence φ. So each node in τ has at least one child, and τ(*x*) defines an infinite tree. Therefore it contains an infinite path π, defining our theory *T**. *T** is complete because π has a node at each level, and therefore either φ_{n}∈*T** or ¬φ_{n}∈*T** for every *n*. *T** is consistent because a proof of a contradiction uses only a finite number of formulas; this would mean that *T°*+Φ_{x} was already inconsistent for some *x* in τ.

Not a lot to say about step (c). Since *L*° is coded in PA, the constants are represented by numbers; we can choose the smallest number in the equivalence class of [*c*] to represent the class. So the domain of our model is a definable set of numbers, and the interpretations of the constants, relation symbols, and function symbols runs pretty much as described earlier. The model *M* is thus specified by a bunch of formulas in L(PA).

Finally step (d). Writing out the details formally would be unbearably tedious. But let’s keep sight of the big picture. Our definition of satisfaction for the model *M* is this: *M*⊧φ iff φ∈*T**. It’s *not* Tarski’s inductive definition. But we can still carry out the arguments for step (d) from the previous section, and these show that our definition obeys all the clauses of the induction: φ∧ψ∈*T** iff φ∈*T** and ψ∈*T**, ¬φ∈*T** iff not φ∈*T**, and ∃*x*φ(*x*)∈*T** iff φ(*a*)∈*T** for some constant *a* in *L*°.

Three more remarks. First: in the previous section, we assumed that the language *L* was countable. But once we’ve formalized the argument in PA, the result applies just as much to uncountable models of PA. You might say that an uncountable model *N* still “thinks that *L* is countable”.

Second: I haven’t bothered to keep track of complexity (in the Δ_{n} sense). But if one does care about that, it’s good to make a tweak to the definition of τ(*x*) above. Instead of using “*T°*+Φ_{x} is consistent”, replace that with “*T°*+Φ_{x} has no proof of a contradiction with Gödel number less than *n*, where *n*=length(*x*)”. This definition is recursive in *T*, still works for the proof, and does not introduce an extra universal quantifier.

Third: we can generalize this entire discussion. Replace the formula θ(*x*) with a formula θ(*x,y*) in L(PA). Then pick an element *a*∈*N* and let θ(*x,a*) define the theory *T*. We will say the *T* is definable in *N* with a parameter. Everything in the section still works, resulting in a model and a truth predicate for it. Of course, the formulas defining the model, and the truth predicate, involve the parameter *a*.

Say θ(*t*) in L(PA) represents a theory *T* . The Arithmetized Completeness Theorem (ACT) tells us how to go from θ(*t*) to bunch of formulas in L(PA) intended to represent a model *M*. It says that PA proves the implication “Con(*T*)→*M* is a model of *T*”. Of course “Con(*T*)” and “*M* is a model of *T*” are written formally in terms of θ(*t*) and the derived formulas.

Briefly, consistent arithmetic theories have arithmetic models.

The key derived formula, call it Sat_{M}, represents “*M*⊧φ”. We noted above that Sat_{M} obeys all the inductive clauses of Tarski’s definition, although Sat_{M} is not defined that way.

What does this mean for a nonstandard *N*? As noted above, in that case we will have sentences of nonstandard length. The formula Sat_{M} defines satisfaction in *M*, even for those formulas!

Tarski’s Undefinability Theorem says that we can’t parley his inductive definition of satisfaction into a formula. But reasoning *outside* of *N*, in the “real world”, we see that for *standard* formulas, Sat_{M} must agree the usual notion of truth—because Sat_{M} obeys all the inductive clauses.

What happens if we collide the Tarski’s Undefinability Theorem with the ACT? Suppose ℕ⊧Con(PA)—if it *doesn’t*, we might as well quit this blog and g0 watch movies. ACT gives us a model *M*⊧PA with an arithmetic formula for satisfaction. If ℕ were elementarily equivalent to *M* (over L(PA)), then “truth in ℕ” would agree with “truth in *M*”. In other words, if φ is a sentence in L(PA), and if ℕ≡*M*, then we’d have ℕ⊧φ ⇔ Sat_{M}(φ). But that would mean we had an arithmetic formula defining truth in ℕ, which can’t be by Tarski’s Theorem. So the model given by the ACT is not elementarily equivalent to ℕ.

What if we let *T*=Th(ℕ), so-called true arithmetic? Assuming ℕ⊧Con(*T*), wouldn’t the ACT give us a model *M*⊧*T* with ℕ≡*M*? Ah, but Th(ℕ) is not arithmetic, so the ACT doesn’t apply.

We observed in post 4 that Tarski’s Theorem holds for any *N*⊧PA. So if *N*⊧Con(PA), then the ACT gives birth to a model of PA that is not elementarily equivalent to *N* (over L(PA)).

The ACT’s special skill is constructing end extensions of models of PA. Say *T* is an arithmetic theory extending PA (or just PA^{−}). Here’s the induction that imbeds ℕ as an initial segment of a model *N *of PA^{−}:

0 ↦ o^{M}

*n*+1 ↦ *n ^{M}* +

*1*

^{M}

^{M}This can be formalized in PA. So if *M* is a model of PA, and if *M*⊧Con(*T*), then there is an end extension *N*⊇_{e}*M* with *N*⊧*T*. Moreover, *N* is a model of *T* in the “real world”. Since *N* is not elementarily equivalent to *M*, *N* is a proper extension: *N*⊃_{e}*M*.

As a slight variation, we have a theorem of Manevitz:

If *N*⊧PA+Con(ZF), then there is a model of ZF whose ω is an end extension of *N*.

Same idea, just formalize the induction imbedding ℕ as an initial segment of the ω of a model of ZF.

ZF, as we know, can prove results about integers beyond the powers of PA. A corollary to Manevitz’s theorem gives more information.

Recall from Topics 8 that Δ_{0} sentences are absolute for end-extensions: if φ is Δ_{0} and *M*⊆_{e}*N*, then *N*⊧φ iff *M*⊧φ. Thus Π_{1} sentences are downwards persistent for end-extensions: if φ is Π_{1} and *M*⊆_{e}*N* and *N*⊧φ, then *M*⊧φ. Now suppose that ZF⊢φ^{ℕ} for a Π_{1} sentence φ. We will show that φ holds in any model *M* of PA+Con(ZF). By Manevitz’s theorem we know *M* has an end-extension *N* that is the ω of a model of ZF. Since ZF⊢φ^{ℕ}, φ holds in *N* and thus by downward persistence also holds in *M*. We conclude:

If ZF⊢φ^{ℕ} for a Π_{1} sentence φ, then PA+Con(ZF)⊢φ.

A theorem of McAloon uses the ACT plus some more twists:

If a nonstandard model *M* of PA has an extension *K *satisfying φ(*c*), where φ(*x*)∈L(PA) and *c*∈*M*, then *M* has an end-extension *N* satisfying φ(*c*). (Both *K* and *N* are assumed to be models of PA.)

At first this may seem rather trivial: *K*⊧PA+φ(*c*) tells us that PA+φ(*c*) is consistent, so by the ACT it has a model *N*, which must be an end-extension of *M* (as we’ve seen). Not so fast! The hypothesis of the ACT is *M*⊧Con(PA+φ(*c*)), and this can involve nonstandard formulas and proofs. PA+¬Con(PA) is famously consistent, but “nonstandardly inconsistent”—a model of it will contain a nonstandard proof of a contradiction. So we have to step carefully.

Let’s write PA* _{n}* for the first

*n*axioms of PA, in some recursive enumeration. Consider

ψ(*n*) ≡ Con(PA* _{n}*+φ(

*c*))

If we knew that *M*⊧ψ(*n*) for all standard *n*, we could use overspill to get *M*⊧ψ(*r*) for some nonstandard *r*. Then we could apply the ACT to the theory PA* _{r}*+φ(

*c*), resulting in an end-extension

*N*⊃

_{e}

*M*with

*N*a model of PA

*+φ(*

_{r}*c*). Since PA

*includes PA, we’d be done.*

_{r}*K*⊧PA+φ(*c*) tells us PA* _{n}*+φ(

*c*) is consistent for all standard

*n*. In other words, ℕ⊧Con(PA

*+φ(*

_{n}*c*)). Now we pull a rabbit out of a hat: it follows from a theorem in proof theory that PA⊢Con(PA

*+φ(*

_{n}*c*))! So

*M*⊧ψ(

*n*).

The proof of our “rabbit” involves our old friend Sat_{Σd}(φ), a truth predicate in L(PA) for Σ* _{d}* sentences. Make a big conjunction out of PA

*+φ(*

_{n}*c*), but still write PA

*+φ(*

_{n}*c*) to stand for this single sentence. Because ℕ⊧PA

*+φ(*

_{n}*c*), we have ℕ⊧Sat

_{Σd}(PA

*+φ(*

_{n}*c*)) for some large enough

*d*. At this point you might think we can declare victory: PA

*+φ(*

_{n}*c*) is true in ℕ, 0=1 is false, and proofs preserve truth. Crucially, we have a notion of “truth” formalized in PA. That is, after all, how the proof of Con(PA) in ZF goes.

But there’s a hitch. The proof of 0=1 from PA* _{n}*+φ(

*c*) might pass through other sentences not belonging to Σ

*. However, a result in proof theory says that any such proof can be converted into one where all the intermediate sentences*

_{d}*do*belong to Σ

*. I confess I know almost nothing about the details, but the buzz phrases “cut-free proof” and “Gentzen’s consistency proof”. At any rate, all this reasoning can be carried out inside PA, and so, for the particular sentence Con(PA*

_{d}*+φ(*

_{n}*c*)), we can pass from satisfaction in ℕ to provability in PA. Recapping, that means

*M*⊧Con(PA

*+φ(*

_{n}*c*)) for every standard

*n*, hence also for a nonstandard

*r*, and hence we can apply the ACT to PA

*+φ(*

_{r}*c*), giving the desired end-extension.