March | 2026 | Diagonal Argument

Let’s look again at the notion of definability, rewritten slightly: for any set A, x⊆A is definable over A if there is a first-order formula φ(y,ū) and elements ā∈A such that

x={z∈A:φ^A(z,ā)}

where φ^A is φ relativized to A, i.e., all quantifiers in φ range only over A.
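A toy computation may make the dependence on A concrete. This is only a sketch under assumptions of my own: hereditarily finite sets are modelled as Python frozensets, and the sample formula φ(z) := ∃w(z∈w) is a hypothetical choice, not one used elsewhere in these notes. Relativizing to A just means that w ranges over A.

```python
# Hereditarily finite sets modelled as frozensets (an assumption of this sketch).
E = frozenset()             # ∅
ONE = frozenset({E})        # {∅}
TWO = frozenset({E, ONE})   # {∅, {∅}}

def definable_subset(A):
    """Return {z ∈ A : φ^A(z)} for the sample formula φ(z) := ∃w (z ∈ w).

    The quantifier is relativized: w ranges over A only.
    """
    return frozenset(z for z in A if any(z in w for w in A))

A1 = frozenset({E, ONE})
A2 = frozenset({E, ONE, TWO})

# In A1 nothing contains ONE, so only ∅ satisfies φ^A1.
print(definable_subset(A1) == frozenset({E}))        # True
# In A2 the new element TWO contains ONE, so ONE now satisfies φ^A2.
print(definable_subset(A2) == frozenset({E, ONE}))   # True
```

The same formula picks out different sets over A1 and A2, which is exactly the dependence on A noted above.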

Very clearly the right-hand side depends on A. In some cases, we can write a formula for x that is independent of A. For example:

x={y} ↔ y∈x ∧ ∀z(z∈x→z=y)

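Formulas like this one, whose meaning does not depend on A (at least for transitive A), are called absolute. Absoluteness is at the heart of Gödel’s proof that V=L holds in L, and failures of absoluteness present the main technical obstacles to showing that L satisfies ZF.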

Three Non-absolute Notions

Let’s look at three non-absolute notions:

	x=𝒫(y)
	z is uncountable
	x=𝒫_un(y)={z⊆y:z is uncountable}

Now think about relativizing these three notions to L and to the L_α’s. Suppose x and y are both present in some L_α, and L_α⊧ x=𝒫(y). As we ascend to higher L_β’s, the assertion “x=𝒫(y)’’ can become false¹, because new subsets of y can appear in L_β. The reverse switch from false to true can’t happen: once we have a “witness” z to x≠𝒫(y) (i.e., z⊆y but z∉x, or z⊈y but z∈x), it won’t go away. (Note that the meaning of x and y can’t change, because the L_α’s are all transitive: by the time x shows up, all its elements have shown up. Ditto for y.)

Here’s a slightly different way to look at it. Consider the sequence x_α=𝒫^L_α(y). The sequence x_α is monotonically increasing (indeed, 𝒫^L_α(y)=𝒫(y)∩L_α). So the assertion x=x_α can flip from true to false as α increases². It can’t flip from false to true, because that would mean that x had some elements (subsets of y) that x_α was missing—but in that case x wouldn’t have been an element of L_α in the first place.
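The true-to-false flip already shows up among hereditarily finite sets, with small transitive sets standing in for the L_α’s. The sketch below is an illustration under my own assumptions (frozensets as sets, the names K1 and K2 are hypothetical); it is not a computation inside L.

```python
def s(*elems):
    """Build a hereditarily finite set as a frozenset."""
    return frozenset(elems)

ZERO = s()
ONE = s(ZERO)
TWO = s(ZERO, ONE)
THREE = s(ZERO, ONE, TWO)
SING_ONE = s(ONE)   # {1}: a subset of 2 that is not an ordinal

def is_transitive(K):
    """Every element of an element of K is an element of K."""
    return all(t in K for z in K for t in z)

def thinks_power_set(K, x, y):
    """K ⊧ x = 𝒫(y): for every z in K, z ∈ x iff z ⊆ y.

    For transitive K, z ⊆ y means the same thing inside K as outside.
    """
    return all((z in x) == (z <= y) for z in K)

K1 = s(ZERO, ONE, TWO, THREE)             # the ordinal 4
K2 = s(ZERO, ONE, TWO, THREE, SING_ONE)   # K1 plus the new subset {1} of 2

print(is_transitive(K1), is_transitive(K2))   # True True
print(thinks_power_set(K1, THREE, TWO))       # True: K1 believes 3 = 𝒫(2)
print(thinks_power_set(K2, THREE, TWO))       # False: {1} is a witness against it
```

Once the witness {1} has appeared, no larger transitive set can make the assertion true again.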

It’s a similar story for uncountability. We have:

L_α⊧ z is uncountable ↔ L_α⊧¬∃f(f:z↣ω)

where f:z↣ω is shorthand for saying that f is an injection from z into ω. If a witness f to the countability of z appears in some higher L_β, then z “becomes countable”, and remains countable after that.

Next we look at x=𝒫_un(y). Let x_α=𝒫_un^L_α(y). New uncountable subsets of y can appear at any time. But also, a subset of y that is uncountable in L_α can become countable later on. The upshot: 𝒫_un^L_α(y) can both gain and lose elements as α increases, and the truth-value of “x=𝒫_un(y)’’ can switch back and forth over and over again.

We will look at the limiting behavior (i.e., 𝒫^L(y) and 𝒫_un^L(y)) later on.

Absolute Formulas

How about absolute formulas? We say that a formula φ(ū) is absolute if for all ā in K

	K⊧φ(ā) ↔ V⊧φ(ā)
	(K a transitive class, ā∈K)

in other words, the truth-value of K⊧φ(ā) doesn’t depend on K, provided only that K is a transitive class. (To be precise, when we say K⊧φ(ā), we mean (K,∈)⊧φ(ā). Also, ā∈K means that all the a_i belong to K. Note that without ā∈K, K⊧φ(ā) makes no sense.)

Δ₀ formulas

A Δ₀ formula is one where all quantifiers are bounded, i.e., of the form (∀x∈y) or (∃x∈y).

Here’s the intuition behind Δ₀ formulas. In post 14, we defined the transitive closure of x to be the elements of x, plus the elements of the elements of x, etc. Here we want the augmented transitive closure, where we add x as an element. Any transitive class containing x as an element contains its augmented transitive closure. In fact, it’s easy to see that the augmented transitive closure of x is the smallest transitive class containing x as an element, and that the augmented transitive closure is a set. OK: if φ(ā) is Δ₀, then to find out if K satisfies φ(ā), we only need to root around in the augmented transitive closure of the a_i’s. We never need to search through all of K. It looks like Δ₀ formulas should always be absolute.

One proves this by a routine induction on complexity. If φ(x̄) and ψ(x̄) are absolute, it’s immediate that ¬φ(x̄), φ(x̄)∧ψ(x̄), and φ(x̄)∨ψ(x̄) are absolute. As for (∃ū∈v)φ(ū,x̄), one direction is easy. Suppose for ā,b∈K we have K⊧(∃ū∈b)φ(ū,ā). Then

	K⊧(∃ū∈b)φ(ū,ā)
	⇒(∃c̄∈K) K⊧c̄∈b∧φ(c̄,ā)
	⇒(∃c̄∈V) V⊧c̄∈b∧φ(c̄,ā)
	⇒V⊧(∃ū∈b)φ(ū,ā)

In the other direction, again with ā,b∈K (as demanded by the definition of absoluteness)

	V⊧(∃ū∈b)φ(ū,ā)
	⇒(∃c̄∈V) V⊧c̄∈b∧φ(c̄,ā)
	⇒(∃c̄∈K) K⊧c̄∈b∧φ(c̄,ā)
	⇒K⊧(∃ū∈b)φ(ū,ā)

We know that c̄∈K in the third line because c̄∈b∈K and K is transitive. The inductive assumption takes us from V⊧φ(c̄,ā) to K⊧φ(c̄,ā).

Some examples of Δ₀ formulas:

Example 1: “x⊆y’’ is Δ₀, since it can be written (∀z∈x)z∈y.

Example 2: “x is an ordinal” is Δ₀. In post 17 we wrote out the clauses for this; for example, one was “(∀u,v∈x)(u<v∨v<u∨u=v)’’. (Recall that < is the same as ∈ for ordinals.) This is obviously Δ₀. The transitivity of x was “(∀u,v)(u∈v∈x→u∈x)’’, which we can rewrite as “(∀v∈x)(∀u∈v)u∈x’’. Only the last clause isn’t Δ₀, even rewritten this way:

(∀y⊆x)(y≠∅ → (∃u∈y) u∩y=∅)

The issue is “∀y⊆x’’. This does not count as a bounded quantifier. But Foundation makes this clause unnecessary.
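For hereditarily finite sets, the two Δ₀ clauses can be checked mechanically. The sketch below assumes frozensets as a stand-in for sets; since a frozenset is automatically well-founded, the well-foundedness clause is not tested, mirroring the role of Foundation. Every loop is bounded by x or by an element of x.

```python
def s(*elems):
    """Build a hereditarily finite set as a frozenset."""
    return frozenset(elems)

def is_ordinal(x):
    """Δ₀ test: x is transitive and linearly ordered by ∈.

    All quantifiers range over x or over elements of x, never over a
    surrounding class K.
    """
    transitive = all(u in x for v in x for u in v)   # (∀v∈x)(∀u∈v) u∈x
    linear = all(u in v or v in u or u == v for u in x for v in x)
    return transitive and linear

ZERO = s()
ONE = s(ZERO)
TWO = s(ZERO, ONE)
THREE = s(ZERO, ONE, TWO)

print(is_ordinal(THREE))                 # True
print(is_ordinal(s(ONE)))                # False: not transitive
print(is_ordinal(s(ZERO, ONE, s(ONE))))  # False: transitive, but 0 and {1} are incomparable
```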

Example 3: “f:x↣ y’’ is Δ₀, i.e., f is an injection of x into y. Intuition: We just need to dig inside the guts of f and its domain and range to show that f is a function and is 1–1.

(∀〈u,z〉∈f) (∀〈v,z〉∈f) u=v

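says that f is injective. The vernacular “∀〈u,z〉∈f …’’ expands to “(∀p∈f)(∀q,r∈p)(∀u∈q)(∀z∈r)(p=〈u,z〉→…)’’: since 〈u,z〉={{u},{u,z}}, both u and z are elements of elements of p, so every quantifier stays bounded and this presents no snags.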
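Here is a rough sketch of “digging inside the guts of f” for hereditarily finite sets. It assumes frozensets as sets and Kuratowski pairs; the helper names are hypothetical. It checks only that f is a function and is 1–1, leaving aside the domain x and the target y.

```python
def s(*elems):
    """Build a hereditarily finite set as a frozenset."""
    return frozenset(elems)

def pair(u, z):
    """Kuratowski pair 〈u,z〉 = {{u},{u,z}}."""
    return s(s(u), s(u, z))

def components(p):
    """Recover (u, z) with p = 〈u,z〉, searching only elements of elements of p.

    Returns None if p is not an ordered pair.
    """
    for q in p:
        for r in p:
            for u in q:
                for z in r:
                    if p == pair(u, z):
                        return (u, z)
    return None

def is_injection(f):
    """f is a set of ordered pairs that is functional and one-to-one."""
    pairs = [components(p) for p in f]
    if any(c is None for c in pairs):
        return False
    functional = all(z1 == z2 for (u1, z1) in pairs for (u2, z2) in pairs if u1 == u2)
    one_to_one = all(u1 == u2 for (u1, z1) in pairs for (u2, z2) in pairs if z1 == z2)
    return functional and one_to_one

ZERO = s()
ONE = s(ZERO)

print(is_injection(s(pair(ZERO, ONE), pair(ONE, ZERO))))    # True
print(is_injection(s(pair(ZERO, ZERO), pair(ONE, ZERO))))   # False: not 1-1
print(is_injection(s(pair(ZERO, ZERO), pair(ZERO, ONE))))   # False: not a function
```

No step of the search leaves the transitive closure of f, which is the point of calling the formula Δ₀.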

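Example 4: “w=ω’’ is Δ₀, i.e., w is the first infinite ordinal. Here’s the formula for this:

w is an ordinal ∧ ∅∈w ∧ (∀x∈w)x⁺∈w ∧ (∀y∈w)(y=∅ ∨ (∃x∈w)y=x⁺)

where x⁺=x∪{x}, the successor of x. The second and third clauses say that w is a nonzero limit ordinal; the last clause says that no element of w is a limit. (Without the middle clauses, every finite ordinal would satisfy the formula.)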

Example 5: f:y↣ω, i.e., f establishes that y is countable. This follows immediately from the last two examples.

In none of these examples do we ever need to climb outside the transitive closure of a given set and wander around the entire class K.

In contrast, we cannot check that K⊧x=𝒫(y) or that x is countable in K without surveying all of K, looking for (respectively) subsets of y and injections f.

Syntactic Analysis

With this in mind, we turn to a syntactic analysis of our non-absolute examples.

As noted, z⊆y and f:z↣ω are Δ₀. So x=𝒫(y) is of the form

∀z δ₁(x,y,z)

and “x is uncountable” is of the form

∀z δ₂(x,z)

where δ₁ and δ₂ are Δ₀.

Finally, x=𝒫_un(y) is of the form ∀z∀f∃g δ₃(x,y,z,f,g), where δ₃ is Δ₀. Showing this takes a bit of work. First we break down x=𝒫_un(y) into three parts:

	∀z(z∈x → z⊆y)
	∧ ∀z(z∈x → ∀f ¬(f:z↣ω))
	∧ ∀z((z⊆y ∧ ∀g ¬(g:z↣ω)) → z∈x)

We ask, how could this conjunction fail to be true? This way:

	∃z(z∈x ∧ z⊈y)
	∨ ∃z(z∈x ∧ ∃f f:z↣ω)
	∨ ∃z(z⊆y ∧ ∀g ¬(g:z↣ω) ∧ z∉x)

Now we use basic logical equivalences to move the quantifiers outwards. Say we have formulas φ, ψ(u), and ξ(u), where u does not appear in φ. Then

φ∧∃uψ(u)	↔∃u(φ∧ψ(u))
φ∨∃uψ(u)	↔∃u(φ∨ψ(u))
φ∧∀uψ(u)	↔∀u(φ∧ψ(u))
φ∨∀uψ(u)	↔∀u(φ∨ψ(u))
∃uψ(u) ∨ ∃uξ(u)	↔∃u(ψ(u)∨ξ(u))
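These five rules can be sanity-checked by brute force over a small finite domain. The sketch below is a check, not a proof: it treats φ as a bare truth value (u does not occur in it) and runs through every pair of unary predicates ψ, ξ on the domain. The function name is my own.

```python
from itertools import product

def check_rules(domain):
    """Return True if all five quantifier-moving rules hold over `domain`
    for every truth value phi and all unary predicates psi, xi."""
    n = len(domain)
    tables = list(product((False, True), repeat=n))   # all predicates on the domain
    for phi in (False, True):
        for psi_vals, xi_vals in product(tables, repeat=2):
            psi = dict(zip(domain, psi_vals))
            xi = dict(zip(domain, xi_vals))
            ex = any(psi[u] for u in domain)   # ∃u ψ(u)
            al = all(psi[u] for u in domain)   # ∀u ψ(u)
            rules = [
                (phi and ex) == any(phi and psi[u] for u in domain),
                (phi or ex) == any(phi or psi[u] for u in domain),
                (phi and al) == all(phi and psi[u] for u in domain),
                (phi or al) == all(phi or psi[u] for u in domain),
                (ex or any(xi[u] for u in domain))
                    == any(psi[u] or xi[u] for u in domain),
            ]
            if not all(rules):
                return False
    return True

print(check_rules([0, 1, 2]))   # True
print(check_rules([]))          # False: the rules need a nonempty domain
```

The failure on the empty domain is harmless here, since first-order logic conventionally assumes a nonempty universe, and every transitive class we care about contains ∅.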

So we can rewrite the failure of x=𝒫_un(y) as:

	∃z ∃f ∀g [
	(z∈x ∧ z⊈y)
	∨ (z∈x ∧ f:z↣ω)
	∨ (z⊆y ∧ ¬(g:z↣ω) ∧z∉x) ]

The stuff inside the brackets is Δ₀, so negating this gives us the form we claimed.

Thinking in terms of witnesses makes this more picturesque. Suppose x≠𝒫_un(y). In other words, x is accused of the crime of not being 𝒫_un(y). The prosecution and defence must provide witness lists before the trial starts. The prosecution lists z and f; the defence, all the g’s. Any one of the three disjuncts is sufficient to convict; let’s imagine a trial lasting three days. The witness z is called each day. On the first day, if z testifies to being an element of x but not a subset of y, game over. But suppose z surprises Jack McCoy (the prosecutor) by being a subset of y. On the second day, f is also called, to testify to the countability of z; McCoy hopes to show that z∈x. Too bad for McCoy, f’s testimony falls apart. On the third day, z is recalled and is shown to be both a subset of y, and not an element of x after all! The defence tries to argue that’s ok, because z is countable. He calls up every single g to testify to being the required injection, but each g fails. The jury convicts and McCoy repairs to the bar to have a drink with his ADA.

The Lévy Hierarchy

Summarizing the previous section:

x=𝒫(y)	is of the form	∀uδ₁(x,y,u)
x is uncountable	is of the form	∀uδ₂(x,u)
x=𝒫_un(y)	is of the form	∀u∀v∃wδ₃(x,y,u,v,w)

where δ₁, δ₂, and δ₃ are all Δ₀ formulas. This syntactic analysis fits into a scheme known as the Lévy Hierarchy.

Formulas of the form ∀uδ(x̄,u) are called Π₁ formulas; replace the ∀ with an ∃, and you’ve got a Σ₁ formula. (Of course, δ here stands for a Δ₀ formula.) More generally, any string of ∀’s is allowed at the front of a Π₁ formula, likewise any string of ∃’s at the front of a Σ₁ formula.

The negation of a Π₁ formula is Σ₁, and vice versa. Truth “propagates upwards” for Σ₁ formulas and “propagates downwards” for Π₁ formulas. The intuition is clear: if K⊧∃ūδ(ā,ū) for some ā∈K with K a transitive class, then we have witnesses, namely elements c̄∈K such that K⊧δ(ā,c̄). The witnesses cannot be impeached by enlarging K, because δ is Δ₀. So ∃ūδ(ā,ū) holds also in any transitive class containing K. Likewise for the downward propagation with Π₁ formulas.

∀ū∃v̄δ(x̄,ū,v̄) is a Π₂ formula; its negation is a Σ₂ formula. A formula that looks like ∀ū∃v̄∀w̄…δ, where there are n alternating quantifier blocks, is a Π_n formula; starting off with an existential block gives a Σ_n formula.

A formula equivalent to both a Σ₁ and a Π₁ formula will thus be absolute, but that word “equivalent” is the kicker. Equivalent in what sense? One answer: φ(x̄) is Σ_n^ZF if there is a Σ_n formula ψ(x̄) such that ZF⊢∀x̄(φ(x̄)↔ψ(x̄)); likewise for Π_n^ZF. If a formula is both Σ_n^ZF and Π_n^ZF, we say it’s Δ_n^ZF. So Δ₁^ZF formulas are absolute between transitive models of ZF. (And of course, Σ₁^ZF formulas are absolute upwards, Π₁^ZF absolute downwards, but only between transitive models of ZF.)

This tradeoff between admitting more formulas and admitting more classes can take a variety of forms. I won’t explore the full landscape, but a few aspects should be highlighted.

First let’s look at the role of bounded quantifiers. In Σ₁ formulas they must appear inside the scope of the unbounded ∃x̄. (∀x∈y)∃z(z∈x) is not Σ₁, for example.

If we restrict attention to ZF-models, then we can allow bounded quantifiers anywhere, and still get something equivalent to Σ₁. Example: the formula (∀x∈y)∃zφ(x,y,z) is ZF-equivalent to ∃u(∀x∈y)(∃z∈u)φ(x,y,z), by an argument involving ranks³. So we can migrate all bounded quantifiers to the inside.

Formulas like (∀x∈y)∃zφ(x,y,z) are absolute upwards for all transitive classes, not just models of ZF. The idea is simple: in quantifying (∀x∈y)… with y∈K, the bounded quantifier never asks us to go outside K, for a transitive K. So if the assertion holds for K, it will hold for any transitive M⊇K.

For “function-like” formulas, we have another trick. We encountered this notion in posts 13 and 17: φ(x̄,y) is function-like if for any x̄ there is a unique y making φ(x̄,y) true. That is, ∀x̄∃y∀z(φ(x̄,z)↔z=y).

Any formula that is absolute upwards between transitive classes K⊆M, and is function-like over both K and M, is in fact absolute between K and M. Proof: The upward direction is given, so suppose ā∈K, c∈M, and M⊧φ(ā,c). Because φ(x̄,y) is function-like over K, there is a b∈K with K⊧φ(ā,b). Because φ(x̄,y) is absolute upwards from K to M, we also have M⊧φ(ā,b). But since φ(x̄,y) is function-like over M, that means b=c. So c∈K and K⊧φ(ā,c).

This result has a counterpart in the Lévy hierarchy. Suppose we have a Σ₁ formula

∃ūφ(x̄,y,ū)

where φ(x̄,y,ū) is Δ₀. Suppose also that K is a transitive class, and ∃ūφ(x̄,y,ū) is function-like for K. That is, for any c̄∈K, there is a unique d∈K such that K⊧(∃ū)φ(c̄,d,ū). Then our formula is equivalent to this Π₁ formula over K:

∀ū∀z(φ(x̄,z,ū)→y=z)

I relegate the proof to the end of this post.

Now suppose that (∃ū)φ(x̄,y,ū) is function-like for both K and M with K⊆M. Then it is equivalent in both classes to a Π₁ formula; we might say it is Δ₁ for K and M, and hence absolute between them.

The Π_n/Σ_n classification is not confined to set theory; in a more general context, quantifier-free formulas play the role of Δ₀ formulas. Historically, proofs in logic often began by reducing formulas to prenex normal form (i.e., all quantifiers in front). This isn’t so widespread anymore. But induction on the “complexity” of formulas still pervades logic, and the Π_n/Σ_n classification is our deepest analysis of this complexity.

Here is the argument about function-like Σ₁ formulas. Suppose the Σ₁ formula holds for (c̄,d). If the antecedent of the Π₁ formula holds for some ū with x̄=c̄ and z=d′, then the Σ₁ formula also holds for (c̄,d′). By the uniqueness hypothesis, d=d′, so the consequent of the Π₁ formula holds for (c̄,d). That shows that the Σ₁ formula implies the Π₁ formula. For the other direction, suppose the Π₁ formula holds for (c̄,d). By the existence part of function-likeness, there must be a ū and a d′ making φ(c̄,d′,ū) true, so the Σ₁ formula holds for (c̄,d′). The Π₁ formula tells us that d=d′, so the Σ₁ formula holds for (c̄,d). This argument can be extended by induction to show that function-like Σ_n formulas are Π_n.

[1] Just to be clear: by saying “x=𝒫(y) becomes false”, I mean that although L_α⊧x=𝒫(y), L_β⊧x≠𝒫(y). Here x and y are fixed elements of L, which both belong to L_α (and thus also to L_β).

[2] Again, to be clear, by “flip” I mean that x=x_α but x≠x_β for two ordinals α<β.

[3] Assume (∀x∈y)∃zφ(x,y,z). For each x in y, let α_x be the least ordinal such that there is a z of rank α_x making φ(x,y,z) true. Let ξ=sup_{x∈y}α_x, and set u=V_{ξ+1}, which contains every set of rank at most ξ. Later on I will call this sort of reasoning a waiting argument.

The constructible universe is traditionally denoted L. L is a subclass of V and is a proper class. Gödel proved three things about L:

All the axioms of ZF hold in L, i.e., L is a model of ZF.
V=L holds in L, i.e., L is a model of the axiom “All sets are constructible”. Cohen: “This is a small but subtle point. It says that a constructible set is constructible when the whole construction is relativized to L.”
V=L→AC and V=L→GCH are both provable in ZF.

So we can’t prove not-GCH in ZF. If we could, it would have to hold in L, but GCH holds in L. Ditto for AC.

L is constructed according to the familiar transfinite scheme, using a function ℱ (discussed below):

L₀	= ∅
L_{α+1}	= ℱ(L_α)
L_λ	= ⋃_{α<λ} L_α
L	= ⋃_{α∈Ω} L_α

Let A be a set. ℱ(A) is a subset of 𝒫(A); it’s the set of all sets that are definable using elements of A.

Here’s the precise definition. For any set A, x⊆A is definable over A if there is a first-order formula φ(y,ū) and elements ā∈A such that

z∈x ↔ z∈A ∧ φ^A(z,ā)

where φ^A is φ relativized to A, i.e., all quantifiers in φ range only over A. (Logic Notes §5 treats relativisation.) As we said before, ℱ(A) is the set of all subsets of A that are definable over A.
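The first few levels of the hierarchy can be computed directly. The sketch below leans on one fact that is stated here as an assumption of the code: for a finite set A, every subset {a₁,…,a_n} is definable over A by the formula z=a₁ ∨ … ∨ z=a_n with the a_i as parameters (and ∅ by z≠z), so ℱ(A)=𝒫(A). That shortcut is only valid for finite A; the function names are hypothetical.

```python
from itertools import combinations

def F(A):
    """Definable subsets of a FINITE set A.

    Assumption of this sketch: for finite A every subset is definable with
    parameters, so ℱ(A) coincides with the full power set 𝒫(A).
    """
    elems = list(A)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )

def finite_levels(n):
    """Return [L_0, L_1, ..., L_n] using L_0 = ∅ and L_{k+1} = ℱ(L_k)."""
    levels = [frozenset()]
    for _ in range(n):
        levels.append(F(levels[-1]))
    return levels

levels = finite_levels(4)
print([len(level) for level in levels])   # [0, 1, 2, 4, 16]
```

So for finite n the levels agree with the usual cumulative hierarchy; the two constructions can only part ways once A is infinite, where ℱ(A) has to be computed from formulas and not from subsets.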

So ℱ(A) is something like 𝒫(A), except we include only those sets where we can explicitly describe their criterion for membership. Cohen discusses how this notion arose from, but did not resolve, concerns about so-called impredicative definitions. (We talked about this in post 3 on the paradoxes, and in post 5 on Zermelo’s proof of the well-ordering theorem.)

Some examples. The singleton {x}, the unordered pair {x,y}, the ordered pair 〈x,y〉={{x},{x,y}}, and the power set 𝒫(x) are all definable from x or from x and y:

z∈{x}	↔ z=x
z∈{x,y}	↔ z=x ∨ z=y
z∈〈x,y〉	↔ z={x} ∨ z={x,y}
z∈𝒫(x)	↔ ∀u[u∈z → u∈x]

Each right-hand side is a first-order formula characterizing the elements of a set. As usual, imagine the vernacular expanded. For example, instead of z={x}, we have ∀t(t∈z↔t=x). The left-hand sides are abbreviations for the right-hand sides, so (for example) in the formal definition of 〈x,y〉, “z={x}’’ and “z={x,y}’’ have been expanded.

The prime examples of sets not obviously definable are choice functions. For example, it’s easy to say what we desire of a choice function c for 𝒫(ℝ), where ℝ=𝒫(ω):

(∀s⊆ℝ) (s≠∅→c(s)∈s)

But this doesn’t characterize c. (We’ve already seen how to express formally “c is a function with domain 𝒫(ℝ)∖{∅}’’, and “c(s)∈s’’.)

Gödel’s L is an example of an inner model. The method of inner models proves relative consistency results: If a theory 𝒯 is consistent, then so is 𝒯+φ, where φ is a formula in ℒ(𝒯). To apply this method, you have to find a formula α(x) in ℒ(𝒯), and show two things:

for all ψ∈𝒯,	𝒯⊢ψ^α
and also	𝒯⊢φ^α

where ψ^α is ψ relativized to α.

You can approach this method syntactically or semantically. First, semantics: Let’s say T is a model for 𝒯. Consider the substructure selected by α(x), call it A. It’s a model of 𝒯+φ, because each formula ψ∈𝒯, when interpreted as speaking about A, is equivalent to ψ^α interpreted in T:

A⊧ψ if and only if T⊧ψ^α

Likewise for φ. We’ve found a model of 𝒯+φ sitting inside a model of 𝒯.

Syntactically, say we had a proof of a contradiction in 𝒯+φ. Go through and relativize everything with α. Now we have a proof of a contradiction in 𝒯: all the relativized axioms of 𝒯+φ can be proved in 𝒯, and it turns out that relativization preserves the logical axioms and rules of inference. (Picky point: we need ∃xα(x) to hold too.)

Gödel’s treatment emphasized the syntactic aspect, Cohen’s the semantic.

Of course the hard part is proving the relativizations:

for all ψ∈ZF,	ZF⊢ψ^L
and also	ZF⊢(V=L)^L

So Con(ZF) → Con(ZF+V=L). People write ZFL for ZF+V=L, “All sets are constructible.” Gödel also showed that ZFL⊢AC and ZFL⊢GCH, giving relative consistency for these too.

Here’s a trivial example of the method of inner models. Let Group be the first-order theory of groups, and let abelian be the axiom x·y=y·x. Any model of Group (i.e., any group) has a model of Group+abelian sitting inside of it, namely its center. The formula

ζ(x) ↔∀y[x·y=y·x]

selects the center of the group. (Picky point: we can’t let the ∀y be implicit, since we need ζ(x) to define a unary relation.) Within Group, we can prove that the center of a group is an abelian group. It’s not totally trivial that the center of a group is even a group, i.e., that it’s closed under the group operation. Anyway, this argument shows the relative consistency result

Con(Group)→Con(Group+abelian)

admittedly a trivial result, but it illustrates the method.
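The group example is small enough to run. The sketch below represents group elements as permutation tuples and takes the symmetric group S₃ and the dihedral group of the square as sample models of Group; those choices, and all the function names, are my own illustrations.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations given as tuples: (p∘q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def center(G):
    """Elements selected by ζ(x) ↔ ∀y[x·y = y·x], with y ranging over all of G."""
    return {x for x in G if all(compose(x, y) == compose(y, x) for y in G)}

def is_abelian_group(H):
    """For a finite subset H of a permutation group: contains the identity,
    is closed under composition (which gives inverses for free), and commutes."""
    identity = tuple(range(len(next(iter(H)))))
    closed = all(compose(a, b) in H for a in H for b in H)
    commutes = all(compose(a, b) == compose(b, a) for a in H for b in H)
    return identity in H and closed and commutes

S3 = set(permutations(range(3)))

# Dihedral group of the square, generated by a rotation r and a reflection f.
r, f = (1, 2, 3, 0), (0, 3, 2, 1)
D4 = {tuple(range(4))}
while True:
    bigger = D4 | {compose(a, g) for a in D4 for g in (r, f)}
    if bigger == D4:
        break
    D4 = bigger

print(len(D4))                        # 8
print(center(S3))                     # {(0, 1, 2)}
print(sorted(center(D4)))             # [(0, 1, 2, 3), (2, 3, 0, 1)]
print(is_abelian_group(center(D4)))   # True
```

Neither S₃ nor D₄ is abelian, yet each contains an abelian group carved out by ζ(x), which is the inner-model pattern in miniature.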

Before Gödel’s L, the most prominent example of this method was von Neumann’s class of well-founded sets: V=⋃_{α∈Ω}V_α. This shows that if ZF minus Foundation is consistent, then ZF is too. The demonstration amounts to a much easier “dry run” for Gödel’s results.

Finally, let’s note an important feature of the inclusion L⊆V. Is it a proper inclusion, i.e., are there any non-constructible sets? Gödel thought so. So do most set-theorists who believe the question has meaning. For a formalist, the only question that has meaning is, what can you prove? Well, if we did have ZF⊢V≠L, that would mean that ZFL was inconsistent. By Gödel’s relative consistency result, that can happen only if ZF itself is inconsistent!

Cohen showed that ZF+V≠L is also consistent (if ZF is), so just like AC and GCH, whether V=L cannot be settled by the axioms of ZF. For a formalist, that’s the end of the story. For a platonist (someone who believes that the universe of set theory “really exists”) the question still has meaning. (Your platonism has to be at least moderately strong: you could believe that a multiverse of sets “really exists”, with V=L true in some universes and not in others.)

For what it’s worth, the consensus among set theorists of the platonist persuasion seems to be that AC is true, GCH is false, and V=L is also false.


Monthly Archives: March 2026

Set Theory Jottings 22. Absoluteness

Set Theory Jottings 21. The Constructible Universe
