First-Order Categorical Logic 5

Prev TOC

JB: Okay, let me try to sketch out a more categorical approach to Gödel’s completeness theorem for first-order theories. First, I’ll take it for granted that we can express this result as the model existence theorem: a theory in first-order logic has a model if it is consistent. From this we can easily get the usual formulation: if a sentence holds in all models of a theory, it is provable in that theory.

MW: Historically you’re right, of course—Gödel’s original formulation was different from the assertion about the existence of models. But nowadays I think when model theorists say “Gödel’s completeness theorem”, they mean both forms interchangeably. Kind of like when physicists refer to “Maxwell’s equations”, they don’t mean the twenty or so PDEs Maxwell originally wrote down!

JB: Okay. Wikipedia still distinguishes them; maybe someone should bring it up to speed, but probably it’s good for beginners to learn this distinction so they know what they’re doing when they gloss over it later.

Anyway: so far I’ve been using ‘model’ in the ordinary sense of first-order logic. But we need to remember what this amounts to in our hyperdoctrine approach. We’ll take our ‘theory’ to be a hyperdoctrine, that is, a functor

B \colon \textrm{FinSet} \to \textrm{BoolAlg}

obeying those extra conditions we discussed last time. (Don’t worry, we don’t need to remember them yet.) We’ll take a ‘model’ of B to be a natural transformation from B to some other hyperdoctrine

C \colon \textrm{FinSet} \to \textrm{BoolAlg} .

But right now C can’t be just any hyperdoctrine, since we’re interested in traditional models where we have a set V, our ‘universe’, and each n-ary predicate of B is interpreted as a subset of Vn. So we want C to be of the form where we choose a set V and let

C(n) = P(V^n)

Way back in Part 1 we saw this gives a functor in a nice way.

So, we basically need to show there is a natural transformation, say

\alpha \colon B \Rightarrow C

with C of this form, if B is consistent. Does this make sense?

MW: Perfectly. Except—did we ever say what it means for a hyperdoctine to be consistent?

JB: No! And this is one place where things get fun. What do you think it should mean?

MW: In classical logic, all sentences (closed formulas) of an inconsistent theory are equivalent. And conversely. So here’s what I think should be the case: for an inconsistent theory B, all the B(n)’s are the trivial boolean algebra (⊥=⊤), and for a consistent theory none of the B(n)’s are trivial.

(By the way, I recently ran across the neat terms ‘verum’ and ‘falsum’ for ⊤ and ⊥.)

JB: Interesting! I’m a simple-minded fellow so I would have guessed that a theory is inconsistent if ⊥=⊤ as propositions—that is, in B(0)—and consistent otherwise. But you’re raising an interesting question! Can we have ⊥=⊤ in some B(n)’s and not in others?

MW: Hmm. A hyperdoctrine is a functor from FinSet to BoolAlg (let’s call that a ‘scaffold’) with some extra conditions. I can think of at least three “pathological scaffolds”, occupying a no-man’s-land between consistent and inconsistent.

First let’s back up a bit, and ask what are the morphisms from the trivial boolean algebra. We’ve been using an ‘order theoretic’ definition of morphism: f is a morphism provided x y implies f(x) ≤ f(y). There’s also an ‘algebraic’ definition: fx) = ¬f(x),  f(xy) = f(x)∧f(y),  f(xy) = f(x)∨f(y). For f:{⊥}→P, these don’t agree! If P is any non-trivial boolean algebra and p is any element of P, the mapping f(⊥)=p is an order-theoretic morphism but not an algebraic one.

JB: Right. It’s probably good to recall why we need morphisms that are just order-preserving, at least at certain points in our definition. We are demanding that a hyperdoctrine give us a boolean algebra B(n) for each n and an order-preserving map B(f): B(m) → B(n) for each function from the n-element set to the m-set… obeying some extra conditions. And one of these extra conditions is that each map B(f) has a left adjoint and a right adjoint. Those maps, those adjoints, will be order-preserving, but we don’t want them to preserve negation. If they did, we’d have

\forall x \neg P(x) \iff \neg \forall x P(x)

and similar for \exists. After all, the left adjoints are supposed to give us universal quantifiers, and the right adjoints existential quantifiers.

MW: Ah, interesting! I didn’t think of that.

But look what happens if we don’t demand that the maps B(f) be full-fledged boolean algebra homomorphisms. It seems we get some pathological scaffolds!

Here’s one: we can take a nice well-behaved hyperdoctrine with all the B(n)’s non-trivial, and replace B(0) with the trivial boolean algebra. (Normally it will be the two-element boolean algebra.) Then for all the empty maps in:∅→{1,…,n}, we let B(in) always be the morphism taking ⊥ to ⊥. Or we let it always be the morphism taking ⊥ to ⊤.

Now if we’re lucky, the extra conditions on a hyperdoctrine will rule these weirdos out. It seems to me it won’t be adjointful. The thing is, in FinSet we have the empty map from ∅ to {x}, say. But we have no map from {x} to ∅. We already know the logical correlatives: ∅ to {x} gives the mapping from ⊤ to a universally true assertion about x, like x=x. Also ⊥ maps to a universally false assertion, like xx. That’s adding a (fictitious) dependence on x.

How do we eliminate a dependence on x, in logic? With the quantifiers! But in our pathological scaffold, although x=x and xare inequivalent, ∀x(x=x) and ∀x(xx) are supposedly equivalent, likewise ∃x(x=x) and ∃x(xx)! Hardly seems kosher!

(Incidently, a while back you called the case of the empty map ∅→{x} “confusingly simple”. It doesn’t seem so bad.)

JB: I just meant that when we’re starting to understand how functions between finite sets give maps between boolean algebras in a hyperdoctine, this is not the first example to look at. Going from one variable to two illustrates the patterns much more clearly than going from none to one. But I agree with you on this: going from none to one is incredibly important, and not really bad!

MW: Anyway, it looks like adjointfulness should rule out one kind of pathology.

JB: I think it does. For any function f between finite sets, the condition that B(f) has both a left and right adjoint implies that it’s a full-fledged boolean algebra homomorphism.

Let’s see why! First let’s show it must preserve ∧ and ∨. This is a special case of a fact from category theory: left adjoints preserve colimits, and right adjoints preserve limits. But let’s just show it directly.

Suppose A and B are posets and the order-preserving map f \colon A \to B is a left adjoint—or in other words, it has a right adjoint g \colon B \to A. Then I claim that f preserves least upper bounds, also known as joins, whenever they exist. That is, whenever a set S \subseteq A has a join \bigvee S we have

f (\bigvee S ) = \bigvee \{ f(a) : \; a \in S \} .

To see this, suppose j = \bigvee S. This implies that a \le j for all a \in S, so f(a) \le f(j), so f(j) is an upper bound of \{ f(a) : \; a \in S\}. We just need to show it’s the least upper bound of this set. So suppose b \in B is any other upper bound of this set. This means that

f(a) \le b

for all a \in S, but thanks to the magic of adjoints this gives

a \le g(b)

so g(b) is an upper bound of S. Since j is the least upper bound we conclude

j \le g(b)

but again, thanks to the magic of adjoints, this gives

f(j) \le b

so f(j) is indeed the least upper bound of \{ f(a) : \; a \in S\}.

This is nice: we’ve shown a left adjoint f preserves joins whenever they exist, so if A and B are boolean algebras it preserves binary joins:

f(xy) = f(x)∨f(y)

and also the least upper bound of the empty set, which is ⊥:

f(⊥) = ⊥.

Dually, if f is a right adjoint is preserves meets whenever they exist, so a right adjoint between boolean algebras obeys

f(xy) = f(x)∧f(y)


f(⊤) = ⊤.

Then comes the coup de gras: if an order-preserving map between boolean algebras is both a left and a right adjoint, it must also preserve negation! The reason is that in any boolean algebra we have

\neg x = \bigvee \{a: \; a \wedge x = \bot \}

so if an order-preserving map between boolean algebras preserves ∧ and ⊥ and all joins, it preserves ¬ as well.

So you see, all that stuff about adjoints is really good for something!

MW: But just demanding all the morphisms be algebraic still leaves one pathological scaffold: let B(0) be the 2-element boolean algebra, and let all the B(n)’s with n>0 be trivial. All the morphisms are obvious. I don’t like this hyperdoctrine either.

JB: Well, maybe we can prove this scaffold isn’t a hyperdoctrine.

Remember, we always have a map B(i):B(0) → B(1) turning 0-ary predicates (‘propositions’) into 1-ary predicates, coming from the injection i of the empty set into the 1-element set. The weird thing about your example is that the proposed B(i) is not injective. That sounds bad to me. Two different nullary predicates shouldn’t give the same unary predicate, at least not in ordinary logic. And remember, our definition of hyperdoctrine is supposed to be capturing all the usual rules of first-order logic.

So what rules, if any, does this example break? As required, f preserves all the boolean algebra operations. Does it really have a left and a right adjoint? I think so: they should be the two different order-preserving maps from the trivial boolean algebra to {⊤,⊥}. After all, left adjoints preserve ⊥ while right adjoints preserve ⊤.

But maybe this thing violates the Beck-Chevalley condition, or the Frobenius law!

MW: Let’s hope so. I don’t feel motivated to check it out, at least right now. (Anyway, we elected to put off worrying about Beck-Chevalley and Frobenius till another day.) On to Gödel’s completeness theorem!

JB: Waaaait a minute. If we can’t do anything as easy as checking that this weird scaffold of yours isn’t a hyperdoctrine, there’s no way we’re gonna do anything hard like reformulating Gödel’s completeness theorem! It’s not like there’s some book or paper where we can read this stuff. We’re making it up. So we need to build our muscles.

And indeed, after spending some insomniac hours thinking about this, it really does seem easy to check this fact, using the Frobenius condition. So let me show you.

The Frobenius condition says that ∃ plays nicely with finite meets. I gave one example of this condition, illustrating how ∃ gets along with binary meets—that is, ˄:

x(P(y) ˄ Q(x)) = P(y) ˄ ∃xQ(x)

But the Frobenius condition also says that ∃ gets along with the nullary meet—that is, ⊤. Here’s the simplest example:

x(⊤) = ⊤

The notation here is a bit confusing: at left we’ve got ⊤ ∈ B(1), but at right we’ve got ⊤ ∈ B(0).

MW: That’s why I like to use x=x for the maximum element (the verum) of B(1).

JB: On the other hand, since ∃x: B(1) → B(0) is defined to be the left adjoint of B(i): B(0) → B(1), it preserves all joins! Left adjoints preserve joins. Among other things, it preserves the nullary join—that is, ⊥. So we also have

x(⊥) = ⊥

None of this is news: it’s true that there exists something that obeys ‘true’, and it’s false that there exists something that obeys ‘false’. But unlike mere mortals, we’re getting these profound insights from the Frobenius condition and adjoints, respectively.

Now for the kicker: suppose we have a hyperdoctrine where the boolean algebra B(1) has just one element, namely ⊤ = ⊥. That gives this equation in B(0):

⊤ = ∃x(⊤) = ∃x(⊥) = ⊥

So the top of B(0) is also the bottom, so B(0) also has just one element.

Thus, your pathological scaffold where B(0) has 2 elements and all the other B(n)’s have just one cannot be a hyperdoctrine!

More generally, we can keep playing this game, using rules like

xyz(⊤) = ⊤

(which come from the Frobenius condition) and

xyz(⊥) = ⊥

(which come from existential quantifiers being left adjoints) to show that in a hyperdoctrine, if any boolean algebra B(n) is trivial, then B(0) must be.

In simple terms: any contradiction in the world of n-ary predicates must trickle down to the world of propositions!

And we’ve already seen the converse: if B(0) is trivial, then so are all the higher B(n)’s, since the inclusion in:∅→n gives a boolean algebra homomorphism from B(0) to B(n).

So I think we now know what an inconsistent hyperdoctrine should be: it’s one for which any boolean algebra B(n), hence all of them, has ⊤ = ⊥.

Whew! Now we really know what ‘inconsistent’ means… so we’re clearly ready to show that any consistent hyperdoctrine has a model. Right?

MW: Right. I’m glad you had the energy to do this.

By the way, didn’t we just construct one of the examples you wanted awhile back: an adjointful functor FinSet to BoolAlg that violates the Frobenius condition?

Anyway, Gödel’s completeness, here we come!

Prev TOC

Leave a comment

Filed under Categories, Conversations, Logic

Epstein Relativity Diagrams

[This post is available in pdf format, sized for small and medium screens.]

Lewis Carroll Epstein wrote a book Relativity Visualized. It’s been called “the gold nugget of relativity books”. I wouldn’t go quite that far, but Epstein has devised a completely new way to explain relativity. The key concept: the Epstein diagram. (I should mention that Relativity Visualized is a pop-sci treatment.)

Continue reading

Leave a comment

Filed under Physics, Reviews

Non-standard Models of Arithmetic 12

Prev TOC Next

JB: It’s been a long time since Part 11, so let me remind myself what we’re talking about in Enayat’s paper Standard models of arithmetic.

We’ve got a theory T that’s a recursively axiomatizable extension of ZF. We can define the ‘standard model’ of PA in any model of T, and we call this a ‘T-standard model’ of PA. Then, we let PAT to be all the closed formulas in the language of Peano arithmetic that hold in all T-standard models.

This is what Enayat wants to study: the stuff about arithmetic that’s true in all T-standard models of the natural numbers. So what does he do first?

MW: First Enayat offers a brief interlude, chatting about the strength of PAT for various T‘s. PAT is stronger than PA, because it can prove Con(PA) and PA can’t. That’s because T extends ZF, and ZF can prove that its \mathbb{N} is a model of PA. ZF would be a pretty poor excuse for a set theory if it couldn’t prove that!

Next: PAZF = PAZFC+GCH, where GCH is the generalized continuum hypothesis. Instead of looking at ZFC+GCH, let’s consider ZFL, i.e., ZF+“every set belongs to Gödel’s constructible universe L”. As Gödel showed, ZFL implies ZFC+GCH, and also Con(ZF) implies Con(ZFL); in fact any model M of ZF contains a so-called inner model LM of ZFL. Now, ordinals are absolute between transitive models of ZF; not only that, but ω is absolute, and the ordinal operations are absolute. (“We don’t have to search all over the kingdom to do arithmetic with finite ordinals.”) So M and LM have the same “\mathbb{N}” as their “standard model of PA”.

Since M and LM have the same “standard model of PA”, the same arithmetic statements hold in both of them. To see if some φ belongs to PAZF, we have to verify the truth of \varphi^\mathbb{N} in all models of ZF. But in the process, we also verify that truth in all inner models LM of ZFL. And when we’ve finished that task, we’ve actually verified that truth in all models of ZFL, just because every model of ZFL is the inner model of at least one model of ZF—namely itself! So PAZF = PAZFL.

Contrast this with ZFI, i.e., ZF+“there is an inaccessible cardinal”. PAZFI is strictly stronger than PAZF. Here’s why: ZFI proves Con(ZF), since if we gather all the sets of rank below the first inaccessible cardinal and toss them in a bag, we’ve got a model of ZF! ZF does not prove Con(ZF), by Gödel’s second incompeteness theorem. So Con(ZF) is our example of a statement in L(PA) that PAZFI delivers and PAZF doesn’t.

JB: By the way, just as a stupid little side note, you’re assuming here that ZF is consistent. But I’m willing to keep that as a background assumption in all these deliberations. (Maybe you already mentioned this and I forgot!)

MW: Yup. Kind of hard to talk about the omegas of models of ZF, if there are no models of ZF!

You might wonder what’s the chief difference between the two cases, ZFI and ZFL. Models of ZF contain inner models of ZFL, and models of ZFI contain inner models of ZF. Ah, but there are models of ZF not contained in any model of ZFI (at least if ZF is consistent). Namely, if ZF is consistent, then ZF+¬Con(ZF) is also consistent, as we’ve just seen. So it has a model. I leave it as an exercise that a model of ZF+¬Con(ZF) can’t possibly be contained in a model of ZFI.

JB: Nice!

Prev TOC Next

Leave a comment

Filed under Conversations, Peano arithmetic

Algorithmic Information Theory

Back in the 60s, Kolmogorov and Chaitin independently found a way to connect information theory with computability theory. (They built on earlier work by Solomonoff.) Makes sense: flip a fair coin an infinite number of times, and compare the results with the output of a program. If you don’t get a 50% match, that’s pretty suspicious!

Three aspects of the theory strike me particularly. First, you can define an entropy function for finite bit strings, H(x), which shares many of the formal properties of the entropy functions of physics and communication theory. For example, there is a probability distribution P such that H(x)=−log P(x)+O(1). Next, you can give a precise definition for the concept “random infinite bit string”. In fact, you can give rather different looking definitions which turn out be equivalent; the equivalence seems “deep”. Finally, we have an analog of the halting problem: loosely speaking, what is the probability that a randomly chosen Turing machine halts? The binary expansion of this probability (denoted Ω by Chaitin) is random.

I wrote up my own notes on the theory, mostly to explain it to myself, but perhaps others might enjoy them.

Leave a comment

Filed under Logic

Review: Smullyan & Fitting, Set Theory and the Continuum Problem

The first sentence of Pollard’s review sums up my feelings perfectly: “This rewarding, exasperating book…” On balance, I found it more exasperating than rewarding. But it does have its charms.

I participated in a meetup group that went through the first two parts of S&F. My fellow participants possessed considerable mathematical knowledge and sophistication, but had only slight prior acquaintance with mathematical logic and none with axiomatic set theory. (The opinions here are strictly my own, but they reflect my experience in the meetup.) If I had just skimmed the book, glancing at familiar material, I would probably have a more positive impression.

I wrote an extensive set of notes for the meetup. This post is basically the last section of those notes.

I will begin with the book’s minuses, so as to end on a positive note.

Continue reading

Leave a comment

Filed under Logic, Reviews

Stirling’s Formula: Ahlfors’ Derivation

If you’re reading this blog, you probably know Stirling’s formula:

n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n

It’s not hard to estimate n! to within a factor of √2; I wrote up a note on this and even easier derivations. It’s quite a bit harder to show that the ratio of the two sides approaches a definite limit as n→∞ and that this limit is 1. You can find a variety of proofs in a number of places, one being Ahfors’ Complex Analysis.  I wrote up a note about this too, expanding on some of the details.

Incidentally, the two sides are asymptotic not just for positive integers n. Replace n! with Γ(z+1) on the left, and both n‘s with z‘s on the right. Allow z to go to infinity in the complex plane, while staying at least a fixed finite distance to the right of the imaginary axis. Then the two sides remain asymptotic. Ahfors proves this stronger result, and uses it to derive the integral form for the Γ function.

Note that if you replace the n‘s with z‘s, you have zz on the right. So you’ve got to worry about branches of the complex logarithm (since zz is defined as ez log z). The note deals with this (and other things).

1 Comment

Filed under Analysis

First-Order Categorical Logic 4

Prev TOC  Next

MW: I made up a little chart to help me keep all these adjoints straight:

Continue reading

Leave a comment

Filed under Categories, Conversations, Logic