# Topics in Nonstandard Arithmetic 9: Tricks with Quantifiers

Every specialty has its tricks of the trade. They become second nature to practitioners, so they often don’t make it into the textbooks. Quantifiers rule in logic; here are some of the games we can play with them. I’ll start with tricks that apply in logic generally, then turn to those specific to Peano arithmetic.

Exchanging quantifiers. Briefly, ∃∀ ⇒ ∀∃. More explicitly, ∃x∀y φ(x,y) ⇒ ∀y∃x φ(x,y). This pops up with uniform convergence, as I mentioned in an earlier post. Note that the implication is one-way. Skolem functions shed more light on this. Suppose ∀y∃x φ(x,y) holds in a model. Choose for each y an f(y) such that φ(f(y),y); that’s a Skolem function for ∃x φ(x,y). In other words, we have

∀y∃x φ(x,y)    ⇔    ∃f ∀y φ(f(y),y)

(Yes, in general you’ll need the axiom of choice.) We’ve exchanged quantifiers, ∀∃ to ∃∀! But at the cost of a second-order quantifier, ∃f. However, if there exists a constant Skolem function f(y) ≡ x, then we’ll have ∃x∀y φ(x,y).

All this works more generally with tuples for x and y, and with strings of multiple quantifiers.
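Since everything here turns on quantifier order, a brute-force check over a small finite domain makes the asymmetry concrete. The following sketch (my own toy example, not from the post) uses φ(x,y) = “x ≠ y”, for which ∀y∃x holds but ∃x∀y fails, and then builds a Skolem function as a lookup table:

```python
# Toy finite-domain check of the quantifier-exchange facts above.
# phi(x, y) = "x differs from y": ∀y∃x phi holds, but ∃x∀y phi fails.
D = range(3)
phi = lambda x, y: x != y

forall_exists = all(any(phi(x, y) for x in D) for y in D)   # ∀y ∃x phi(x,y)
exists_forall = any(all(phi(x, y) for y in D) for x in D)   # ∃x ∀y phi(x,y)
assert forall_exists and not exists_forall                  # converse of ∃∀ ⇒ ∀∃ fails

# A Skolem function f converts ∀y∃x phi(x,y) into ∃f ∀y phi(f(y), y):
f = {y: next(x for x in D if phi(x, y)) for y in D}         # pick a witness for each y
assert all(phi(f[y], y) for y in D)
```

On an infinite domain, the witness-picking step that builds f is exactly where the axiom of choice sneaks in.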

Turnstile jumping. People often refer to the proof (⊢) and satisfaction (⊧) symbols as turnstiles. Say T is a theory, and M is a model of T. We have two maneuvers I like to call turnstile jumping:

T ⊢ ∀x φ(x)    ⇒    ∀a∈M  T ⊢ φ(a)

M ⊧ ∀x φ(x)    ⇔    ∀a∈M  M ⊧ φ(a)

For the first one, we have a technical issue: the a in φ(a) must belong to the language of T. For PA and some other theories, we can use terms: φ(0), φ(1), φ(1+1), etc. In general we can add names to the language. For both maneuvers, the quantifier on the right is really a meta-quantifier, not part of the formal language L(T). Sometimes I distinguish meta-quantifiers by writing them out in prose, “for all a∈M”, but I prefer to use the symbols for this post.

The first “jump” is a one-way implication, the second an equivalence. For example, let T be PA, and let M be ℕ. Let φ(x) be “x is not the Gödel number of a proof of a contradiction”. For every n∈ℕ, we have ℕ ⊧ φ(n); not only that, but PA ⊢ φ(n), because verifying φ(n) amounts to a finite computation—in principle no different from checking that 2+2=4. Now ∀x φ(x) is the famous sentence Con(PA), unprovable in PA but true in ℕ.

You may wonder how turnstile jumping buys us anything. I’ll give some examples in a bit.

Cofinal extensions. Now let’s look at some PA-specific tricks. Suppose M ⊆cf N, M and N being structures for L(PA). We have, for example

N ⊧ ∀x∃y φ(x,y)    ⇔    N ⊧ ∀m∈M ∀x<m ∃n∈M ∃y<n φ(x,y)

The left-to-right is obvious, but the right-to-left takes only a little thought. “∀m∈M ∀x<m” covers all the x’s in N, precisely because M is cofinal in N. If we want to get persnickety, “m∈M” doesn’t belong to L(PA). No problem: just expand the language by adding a new predicate symbol M, with M(m) interpreted as m∈M.

We can replace x, y with tuples, in which case x < m means that each component of the tuple x is less than m. Likewise for y < n.

Let’s take these tricks out for a trial run. Recall Gaifman’s “mole disguise lemma” from the last post: If M ⊆cf N, M ⊧ PA, and M ≺Δ0 N, then M ≺ N. (Note especially that the Induction Axioms are not assumed to hold in N, though of course M ⊧ PA and M ≺ N imply that they do.) The proof builds up from Δ0 to Π1 to Σ2 to Σn for all n. At each stage we have to show both upward and downward persistence. Π1 downward persistence is sort of obvious, since we’re prepending universal quantifiers to a Δ0 predicate. Still, a formal argument illustrates turnstile jumping nicely. Let the Δ0 predicate be δ; it can include parameters from M. Also, x and a can both be tuples.

N ⊧ ∀x δ(x)
⇔  ∀a∈N  N ⊧ δ(a)
⇒   ∀a∈M  N ⊧ δ(a)
⇔   ∀a∈M  M ⊧ δ(a)       (because M ≺Δ0 N)
⇔   M ⊧ ∀x δ(x)

I’ve indicated where the implications go both ways, although of course a single one-way implication makes the end-to-end implication one-way.

For upward persistence, we use the cofinality trick. (Here x and a can both be tuples, but p is not. Turnstiles are jumped without comment. Parameters from M are allowed in δ.)

M ⊧ ∀x δ(x)
⇔   M ⊧ ∀p ∀x<p δ(x)
⇔   ∀p∈M   M ⊧ ∀x<p δ(x)
⇔   ∀p∈M   N ⊧ ∀x<p δ(x)      (because M ≺Δ0 N)
⇔   N ⊧ ∀x δ(x)                           (because M ⊆cf N)

So from M ≺Δ0 N we’ve deduced M ≺Π1 N (assuming cofinality).

Upward Σ2 persistence follows from upward Π1 persistence, much as downward Π1 persistence followed from downward Δ0 persistence. For downward Σ2 persistence, we need a new trick: Collection.

Collection. The Collection scheme of PA says, roughly, that a finite set has a finite maximum. Formally:

PA ⊢ ∀x<p ∃y φ(x,y) ↔ ∃w ∀x<p ∃y<w φ(x,y)

for any formula φ(x,y) of L(PA). The formula can contain a parameter z (i.e., φ(x,y,z)). (Imagine an implicit ∀p and ∀z surrounding everything to the right of the turnstile.) As usual, you can replace x, y, z all with tuples. To see the truth of Collection in ℕ, pick a yi for each i<p such that φ(i,yi); then let w be the maximum of the yi’s.

People call these the Collection Axioms, although they’re theorems of PA. That’s because, apart from a niggle, you can replace the Induction Axioms with them. (The niggle: you still need Induction for Δ0 formulas.) Since they are theorems in PA, they hold even when our finite set is a “finite” set, i.e., a definable set with a nonstandard number of elements.
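Here is a toy illustration of the idea behind Collection, run in the standard model ℕ. (The predicate “y is an even number above x” is just a stand-in I chose for the demo.)

```python
# Toy instance of Collection in ℕ: phi(x, y) = "y is an even number above x".
# Every x has a witness, and the finitely many witnesses below p share a bound w.
phi = lambda x, y: y > x and y % 2 == 0

def least_witness(x):
    # least y with phi(x, y); one exists for every x
    y = 0
    while not phi(x, y):
        y += 1
    return y

p = 10
witnesses = [least_witness(x) for x in range(p)]    # ∀x<p ∃y phi(x,y), constructively
w = max(witnesses) + 1
# ∃w ∀x<p ∃y<w phi(x,y): the single bound w works for every x below p
assert all(any(phi(x, y) for y in range(w)) for x in range(p))
```

The point is that the single bound w, obtained as the maximum of finitely many witnesses, converts ∀x<p ∃y into ∃w ∀x<p ∃y<w.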

The most important application of Collection: prepending a bounded quantifier to a Σn or Πn predicate preserves the level. (I mentioned this in post 3.) If you peruse the example above, you’ll see that we’ve “moved the bounded ∀ past the ∃”—that is, (∀x<p)∃⇒∃(∀x<p). But it’s a peculiar sort of exchange of quantifiers, since we also changed variables: we didn’t move ∃y past (∀x<p), instead we bounded ∃y and stuck a new existential quantifier out front. Still, the effect is to move the bounded quantifier inward. We can keep doing it past any string of quantifiers, since universal quantifiers commute with each other.

To move a bounded existential quantifier inward, we use the contrapositive form:

PA ⊢ ∃x<p ∀y φ(x,y) ↔ ∀w ∃x<p ∀y<w φ(x,y)

Returning to the mole disguise lemma, we prove downward Σ2 persistence. This uses all of our tricks, as indicated (although I don’t point out instances of turnstile jumping). As before, δ stands for the Δ0 predicate, all the variables can be tuples except the bounds p, q, and w, and parameters from M are allowed in δ.

N ⊧ ∃x ∀y δ(x,y)
⇔ ∃p∈M   N ⊧ ∃x<p ∀y δ(x,y)                               (cofinality)
⇔ ∃p∈M   N ⊧ ∃x<p ∀q∈M ∀y<q δ(x,y)              (cofinality)
⇒  ∃p∈M ∀q∈M    N ⊧ ∃x<p ∀y<q δ(x,y)             (∃∀⇒∀∃)
⇔  ∃p∈M ∀q∈M    M ⊧ ∃x<p ∀y<q δ(x,y)            (Δ0 absoluteness)
⇔  ∃p∈M    M ⊧ ∀w ∃x<p ∀y<w δ(x,y)
⇔  ∃p∈M    M ⊧ ∃x<p ∀y δ(x,y)                             (Collection)
⇔  M ⊧ ∃x ∀y δ(x,y)

Curiously enough, Collection is used to move a bounded quantifier outward. Notice that we use it only in M—we don’t know a priori that it holds in N.

“Finite” sequences. To finish the proof of the mole disguise lemma, we show that M ≺Σn N ⇒ M ≺Σn+1 N for all n≥2. Recall that M ≺Σn N iff M ≺Πn N. We’ll show that M ≺Σn N ⇒ M ≺Πn+1 N. As usual, the downward direction is easy. For upward persistence, we need a new trick.

“Finite” sequences take Collection one step further. (“Finite” is in scare-quotes because sequences of nonstandard length are OK.) Collection relied on the idea that a finite sequence {y0,y1,…,yn} has a finite maximum. But PA can code “finite” sequences of arbitrary length as single integers (see Gödel’s β-function). That means we can replace ∀x<p ∃y φ(x,y) with ∃w ∀x<p φ(x,(w)x), where (w)x stands for the x-th component of the sequence coded by w. Why would you want to? Well, it’s another way to convert ∀∃ to ∃∀. (Very similar to Skolem functions.)

We have to tread carefully here. We actually have a predicate (w)x=z, not a new function symbol. That means φ(x,(w)x) gets transcribed as either ∃z((w)x=z ∧ φ(x,z)) or ∀z((w)x=z → φ(x,z)). As it happens, (w)x=z is Δ0, and since we have a choice of quantifier, the formula φ(x,(w)x) appears at the same level in the arithmetic hierarchy as φ(x,y)—except if φ(x,y) is Δ0, then φ(x,(w)x) is Δ1.
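For the curious, here is a runnable sketch of the β-function coding itself. In PA the pair (a, b) would be fused into one number w by a pairing function; the sketch below just exhibits β and the Chinese-Remainder-Theorem argument that every finite sequence can be encoded. (The helper names and the particular choice of b are mine.)

```python
from math import factorial, prod

def beta(a, b, i):
    """Gödel's beta function: recovers the i-th component from the pair (a, b)."""
    return a % (b * (i + 1) + 1)

def encode(seq):
    """Find (a, b) with beta(a, b, i) == seq[i] for all i, via the CRT."""
    n = len(seq)
    # b = k! with k ≥ n and k ≥ every entry makes the moduli below
    # pairwise coprime and larger than each entry.
    b = factorial(max([n] + list(seq)))
    mods = [b * (i + 1) + 1 for i in range(n)]
    M = prod(mods)
    a = 0
    for s, m in zip(seq, mods):
        Mi = M // m
        a += s * Mi * pow(Mi, -1, m)   # Chinese Remainder reconstruction
    return a % M, b

seq = [3, 1, 4, 1, 5, 9]
a, b = encode(seq)
assert [beta(a, b, i) for i in range(len(seq))] == seq
```

Decoding a component is a single bounded computation, which is why (w)x=z comes out Δ0.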

OK, let’s use this to polish off the mole disguise lemma. Assume n≥2 and M ≺Σn N. Let φ(x,y) be Πn−1, so ∀x∃y φ(x,y) is Πn+1. We want to show that M ⊧ ∀x∃y φ(x,y) implies N ⊧ ∀x∃y φ(x,y).

M ⊧ ∀x ∃y φ(x,y)
⇔ ∀p∈M   M ⊧ ∀x<p ∃y φ(x,y)
⇔ ∀p∈M   M ⊧ ∃w ∀x<p φ(x,(w)x)         (“finite” sequences)
⇔ ∀p∈M   N ⊧ ∃w ∀x<p φ(x,(w)x)          (Σn upward persistence)
⇒ ∀p∈M   N ⊧ ∀x<p ∃y φ(x,y)
⇒  N ⊧ ∀x ∃y φ(x,y)

Let’s check the level of ∃w ∀x<p φ(x,(w)x). Since n≥2, φ(x,y) (at Πn−1) is above Δ0, so φ(x,(w)x) is at the same level. Prefixing an ∃∀ takes us to Σn, so the “Σn upward persistence” step is justified.

The next-to-last step calls for close scrutiny. Remember, we cannot assume N satisfies PA. But as long as (w)x is defined in N, we’re good to go. In other words, we need

N ⊧ ∀w ∀x ∃z (w)x=z

But that sentence is Π2, and we already know that M ≺Π2 N. So we’re fine.

The mole disguise lemma has a corollary: every nonstandard model of PA has a proper elementary cofinal extension. Let M ⊧ PA be nonstandard. First we use compactness to get a proper elementary extension N ≻ M that is not an end extension. Here’s how: we let T be the theory consisting of (a) all closed sentences of L(PA)M that are true in M, where L(PA)M is L(PA) augmented with names for all elements of M; and (b) a new constant m, plus the axiom m<b for some nonstandard b in M, and all the axioms m≠a for every a∈M. Because b is nonstandard, we can satisfy any finite subtheory of T by choosing some element of M to be m. So T has a model N, which is an elementary extension of M by the (a) axioms, and has a mole m by the (b) axioms. (If you want to get fussy, N has an elementary submodel that is isomorphic to M.)

Now we do the Gaifman splitting: K = {x∈N : ∃y∈M x<y}. We immediately have M ⊆cf K ⊆e N. (K contains the mole m, so it’s a proper extension.) It’s easy to see that K is closed under plus and times. So K is an initial substructure of N, which tells us that K ≺Δ0 N. (Recall from the last post that end extensions are always Δ0-elementary.) Downward Π1 persistence follows, implying that K satisfies PA. We have M ⊆cf K, M ⊧ PA, and K ⊧ PA, everything we need for the lemma except M ≺Δ0 K. For that, we go up and then down: M ≺ N and K ≺Δ0 N, so a Δ0 sentence (with parameters from M) holds in M iff it holds in N iff it holds in K. Conclusion: M ≺ K, making K a proper elementary cofinal extension of M.

The Gaifman Splitting Theorem tells us that anytime we have PA models M ⊆ N, the Gaifman split M ⊆cf K ⊆e N produces a K that is an elementary extension of M (and hence also a model of PA). If you review the last paragraph, you’ll see that only one piece needs a replacement: the argument that M ≺Δ0 K. The MRDP Theorem fills the gap.

MRDP stands for Yuri Matiyasevich, Julia Robinson, Martin Davis, and Hilary Putnam. Before stating the theorem, an anecdote. After Matiyasevich provided the final stone in the proof of the theorem, he wrote a book, Hilbert’s Tenth Problem. Martin Davis contributed a foreword. An excerpt:

During the 1960s I [Davis] often had occasion to lecture on Hilbert’s Tenth Problem. At that time it was known that the unsolvability would follow from the existence of a single Diophantine equation that satisfied a condition that had been formulated by Julia Robinson. However, it seemed extraordinarily difficult to produce such an equation, and indeed, the prevailing opinion was that one was unlikely to exist. … Inevitably during the question period I would be asked for my own opinion as to how matters would turn out, and I had my reply ready: “I think that Julia Robinson’s hypothesis is true, and it will be proved by a clever young Russian.”

… I met Yuri a few months later at the International Congress of Mathematicians in Nice, where he was an invited speaker. I was finally able to tell him that I had been predicting his appearance for some time.

The MRDP theorem says that for every Σ1 predicate there is an equivalent ∃1  predicate. Moreover, the equivalence is provable in PA. Explicitly, for every formula ∃x δ(x,y) where δ is Δ0, there is a formula ∃x ξ(x,y) where ξ is quantifier-free, such that PA ⊢ ∀y(∃x δ(x,y) ↔ ∃x ξ(x,y)). (Taking negations, every Π1 predicate has an ∀1 equivalent.)

∃1 predicates are automatically persistent upward, and ∀1 predicates are persistent downward. Therefore between models of PA, Σ1 predicates are persistent upward and Π1 predicates are persistent downward. Thus Δ1 predicates are absolute (between PA models); a fortiori, so are Δ0 predicates. That’s the missing ingredient for the Gaifman Splitting Theorem.
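The theorem itself is far too deep to demo, but a baby example shows the flavor of trading a bounded search for a bare existential: “x is composite” is Δ0 (search for a divisor below x), and it is also defined by the purely existential Diophantine condition ∃u,v (u+2)(v+2) = x. In the sketch below (my own example, not from the post) the existential searches are artificially bounded so the demo terminates; in ℕ they are unbounded:

```python
# "x is composite" two ways: as a Δ0 predicate, and as a Diophantine condition.
def composite_delta0(x):
    return any(x % d == 0 for d in range(2, x))          # bounded divisor search

def composite_diophantine(x):
    # ∃u,v (u+2)(v+2) = x — just ask whether an equation has a solution
    return any((u + 2) * (v + 2) == x
               for u in range(x) for v in range(x))

assert all(composite_delta0(x) == composite_diophantine(x) for x in range(2, 200))
```

MRDP says this is no accident: every Σ1 predicate, not just this handpicked one, admits such an equation.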

The MRDP theorem is quite remarkable. You’d hardly expect every Δ0 formula to have a quantifier-free equivalent; in general, they don’t. But just slap an ∃ or an ∀ in front, and the distinction disappears! How’s that for a quantifier trick?