First-Order Categorical Logic 1

(A conversation between John Baez and Michael Weiss.)

JB: Okay, maybe it’s a good time for me to unleash some of my crazy thoughts about logic. They’ve been refined a lot recently, thanks to all the education I’ve been getting from you and folks on the n-Category Café. So, I can actually start with stuff that’s not crazy at all… although it may seem crazy if you’re not used to it.

I’ll start with some generalities about first-order classical logic. (I don’t want to get into higher-order logic or intuitionistic logic here!) The first idea is this. In the traditional approach, syntax and semantics start out living in different worlds. In categorical logic, we merge those worlds.

Recall the traditional approach. A first-order theory is a set of axioms, sentences that can be written down using logical symbols and a chosen bunch of n-ary predicate symbols. (We can include function symbols if we want, but they’re not strictly necessary so let me leave them out for now.) All this lives in the world of syntax: essentially, scribbles on paper.

On the other hand, a model is a choice of a set V (pompously called the ‘universe’), together with a subset of V^n for each n-ary predicate symbol. All this lives in the world of semantics: we are working in some world of sets. Which world of sets? This may be left vague, or we can try to specify it.

If we try to specify it, we usually do this using a first-order theory, like ZF! So then we’re back in the world of syntax.

This is a bit confusing, at least for nonexperts. For example, we have a theory called PA that’s talking about arithmetic, and we can prove things from the PA axioms. All this is syntax. If we want to understand what all this syntax really means, we can look at a model in the world of sets. If we get nervous about which world of sets we’re using, we can specify it using another theory, like ZF. But now we’re back in the world of syntax. If we want to know what all this syntax really means, we can again look at a model of ZF in the world of sets—and not necessarily the same world of sets we were just talking about. To specify this world of sets we can use another theory… or maybe even the same one again. It’s a bit weird: not exactly circular, but certainly a bit disorienting at first.

Lawvere clarified all this stuff immensely in his 1963 thesis Functorial Semantics of Algebraic Theories, and subsequent work. At least, he clarified it for those who understood what he was saying!

His approach applies to many kinds of logic, not just first-order classical logic, and his thesis illustrated it in a stripped-down form of logic called ‘universal algebra’. But never mind: the idea is to treat both syntax and semantics as things of the same kind—that is, objects of the same category.

A theory T, in this view, consists of all the predicates that we can write down using our chosen predicate symbols and logical symbols… modulo the relation that two predicates P(x_1, \dots, x_n) and Q(x_1, \dots, x_n) count as ‘the same’ if we can prove

P(x_1, \dots, x_n) \iff Q(x_1, \dots, x_n)

using our axioms. For each n we get a set T(n) of these predicates, or really equivalence classes. This set T(n) forms a Boolean algebra under the usual operations of logic, like ‘and’ and ‘or’ and ‘not’.

A context, let me call it, is determined by a set V. Then, for each n, we get the collection P(V^n) of all subsets of V^n. This set P(V^n) forms a Boolean algebra under the usual operations of set theory, like ‘intersection’ and ‘union’ and ‘complement’.

So, you see that a theory and a context are things of the same ilk: a choice of Boolean algebra for each natural number n. Let’s call such a thing a ‘polyadic Boolean algebra’. Polyadic Boolean algebras have more structure than just this: they’re not really just a big list of Boolean algebras; they have to fit together somehow. But let’s not worry about that yet. The key point is that a model can then be defined as a map from one polyadic algebra to another! A model of the theory T in the context V is a map of polyadic Boolean algebras that maps elements of each T(n) into elements of each P(V^n)… while preserving all the structure of a polyadic Boolean algebra.
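
Here is a tiny Python sketch of the ‘context’ side over a made-up universe V = {0, 1, 2}; the universe, the sample predicates, and all the names below are invented for illustration. The point is just that n-ary predicates become subsets of V^n, and ‘and’, ‘or’, ‘not’ become intersection, union, and complement.

```python
from itertools import product

V = {0, 1, 2}                        # a toy universe

def full(n):
    """All of V^n: the top element of the Boolean algebra P(V^n)."""
    return set(product(V, repeat=n))

# Two 2-ary predicates, given semantically as subsets of V^2.
less_than = {(a, b) for (a, b) in full(2) if a < b}
equal     = {(a, b) for (a, b) in full(2) if a == b}

# The Boolean operations of the context are just set operations.
conj = less_than & equal             # 'less than and equal to'
disj = less_than | equal             # 'less than or equal to'
neg  = full(2) - less_than           # 'not less than'

assert conj == set()
assert disj == {(a, b) for (a, b) in full(2) if a <= b}
assert neg  == {(a, b) for (a, b) in full(2) if a >= b}

# A model of a theory sends each (class of a) syntactic predicate to such a
# subset, preserving 'and', 'or', 'not' -- and, later, the quantifiers.
interpretation = {"x < y": less_than, "x = y": equal}
```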

Does this make sense? The essence of a ‘model’ is that it’s a map from expressions to their meanings… but the expressions and their meanings have now been organized into gadgets of the same nature—polyadic Boolean algebras—so we can begin to do cool tricks like map meanings to expressions, or compose these maps. It makes logic more like ordinary algebra, where you can map groups willy-nilly to other groups, via homomorphisms.

MW: This is intriguing. I have a few “hot takes”. (a) Reminds me a little of Abraham Robinson’s method of diagrams, one of the signal techniques in model theory. The idea is to capture enough info about a model M in a theory D_M, so that any model of D_M contains a substructure isomorphic to M. (b) On the logic side, the first order of business is to see what the “all time greatest hits” of model theory—completeness, compactness, Löwenheim-Skolem—have to say about this. (I have guesses.) (c) Conversely, on the category side, how about the usual suspects—monics and epics, functors, natural transformations, adjoints. How do they weigh in?

But prior to all that, two questions. (1) “But let’s not worry about that yet.” I worry! Quantifiers hang out in that additional structure, right? You don’t get far in model theory without quantifiers! (2) You’re building a category, yes? The objects are the polyadic Boolean algebras, and the morphisms are what you’d expect. As opposed to each polyadic Boolean algebra being a category somehow, and interpretations being functors.

JB: Wow, three remarks and two questions! This is truly a hydra-headed comment. Unlike Hercules, I won’t tackle them all, just a few of my favorites.

You ask what the “all time greatest hits” of model theory—completeness, compactness, Löwenheim-Skolem—have to say about polyadic Boolean algebras. This is exactly the sort of thing I’d like to investigate, as soon as we’ve nailed down what a polyadic Boolean algebra really is. This should be a lot of fun.

For example, I suspect that the compactness theorem will follow from certain topological spaces being compact! Why? Because there’s a correspondence between Boolean algebras and Stone spaces, which are compact Hausdorff spaces of a special sort. Indeed, the category of Boolean algebras and Boolean algebra homomorphisms is equivalent to the opposite of the category of Stone spaces and continuous maps! This is Stone duality, the categorical form of the Stone representation theorem. So, when we say that a polyadic Boolean algebra is an infinite list of Boolean algebras, all fitting together in a certain way, we could equivalently say it’s an infinite list of Stone spaces, all fitting together in a certain way. Topology and logic are two sides of the same coin.
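
As a finite warm-up for that correspondence (compactness only really earns its keep in the infinite case), here is a brute-force Python check that, for the powerset algebra of a small set X, every ultrafilter is principal, so the Stone space is just X itself with the discrete topology. The set X and all the names below are invented for the test; this is only a sketch of the finite case, not the general theorem.

```python
from itertools import chain, combinations

X = {0, 1, 2}                                    # a small finite set

def powerset(xs):
    xs = list(xs)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

algebra = powerset(X)                            # the Boolean algebra P(X): 8 elements

def is_ultrafilter(U):
    """A proper filter containing exactly one of A and its complement, for every A."""
    U = set(U)
    return (frozenset() not in U
            and all(A & B in U for A in U for B in U)          # closed under 'and'
            and all((A in U) != (frozenset(X) - A in U)        # A or not-A, never both
                    for A in algebra))

# Brute force over all 2^8 families of elements of the algebra.
ultrafilters = {frozenset(U) for U in powerset(algebra) if is_ultrafilter(U)}

# Every ultrafilter of P(X) is principal: {A : x in A} for a single point x.
# So the Stone space of P(X) is just X again, with the discrete topology.
assert ultrafilters == {frozenset(A for A in algebra if x in A) for x in X}
```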

You also ask how all the usual concepts of category theory will shed light on first-order logic. This too will be huge fun to investigate!

And by the way, when I say this subject will be fun to investigate, I really mean it. I don’t know of huge piles of papers on this stuff. Categorical logicians tend to go straight to higher-order intuitionistic logic, with just a quick backward glance at first-order classical logic. I surely haven’t read everything I should. I’ll try. But still, I think it should be pretty easy to find new territory to explore around here.

And just to get you in the right mindset, eventually we’ll see how a Boolean algebra is a category of a special kind. But that can wait till later.

But we’re getting way ahead of ourselves. First we need to wrap our heads around polyadic Boolean algebras. And you’re right: the first thing we need to understand is how a polyadic Boolean algebra is more than merely a Boolean algebra B(n) for each natural number n, whose elements we think of as n-ary predicates. We need to talk about how these Boolean algebras fit together.

Your instinct is right: this is where quantifiers come in. Can you guess how? What do quantifiers do to n-ary predicates? More generally, what are the basic ways to get m-ary predicates from n-ary ones, in classical first-order logic?

MW: Ages ago, Rasiowa and Sikorski gave topological proofs of the Gödel completeness and the Löwenheim-Skolem theorems (Fundamenta Mathematicae, 1950 (v. 37, pp. 193–200) and 1951 (v. 38, pp. 230–232)). Truth be told, those are papers I always Intended To Read (even though they’re quite short). Maybe because I have a pretty good guess at how it works. Let’s just look at the statement of the compactness theorem. Let points be structures, and let a basic closed set be all the structures satisfying a given closed formula (aka sentence). A collection of sentences has a model iff every finite subset has a model—this translates to: a collection of closed sets has non-empty intersection iff it has the finite intersection property, i.e., the space I’ve just outlined is compact. (Of course, to pump this up to proofs of the “all time greatest hits” would take more work. Minor point: you’d have to do some futzing to avoid proper classes, at least if you want to stay inside the climate-controlled rooms of ZFC.)

Rasiowa and Sikorski also had the connection with Boolean algebras. But the typical textbook proof of Gödel’s completeness theorem trots out the Lindenbaum algebra (which is just your B(0), I believe), so that’s not so novel.

But this doesn’t quite correspond to the topological spaces that go with your tower of Boolean algebras B(n)—those bear a family resemblance to the Zariski topologies of algebraic geometry.

OK, quantifiers. First thing that pops into my head, existential quantification goes hand in hand with projection. This is the basis for the Kleene hierarchy in recursion theory, in turn inspired by the hierarchies of descriptive set theory. (Also I think why Tarski called his algebraization of predicate calculus “cylindric algebras”.) So we have projection functions \pi^n_i from B(n) to B(n-1), with \pi^3_2 (for example) mapping [\varphi(x_1,x_2,x_3)] to [\exists x_2\;\varphi(x_1,x_2,x_3)]. I’m using [] to stand for “equivalence class of”, naturally.

The algebraic axioms we need should fall right out of the laws of predicate calculus. For example,

\exists x(\varphi(x,\vec{y})\vee\psi(x,\vec{y}))\leftrightarrow \exists x\,\varphi(x,\vec{y})\vee\exists x\,\psi(x,\vec{y}),

so

\pi^n_i(p\vee q)=\pi^n_i(p)\vee\pi^n_i(q).

For conjunctions, things are not quite as nice:

\exists x(\varphi(x,\vec{y})\wedge\psi(x,\vec{y}))\rightarrow\exists x\,\varphi(x,\vec{y})\wedge\exists x\,\psi(x,\vec{y}),

but the reverse implication doesn’t hold. So we only get an inequality:

\pi^n_i(p\wedge q)\leq\pi^n_i(p)\wedge\pi^n_i(q).

On the other hand, universal quantification plays nicely with conjunctions. And we can define universal quantification via this formula: \neg\pi^n_i(\neg p).

Hmm, notation? Borrowing from the hierarchies, I’d be tempted to use \sigma^n_i for existential quantification, \pi^n_i for universal. But I’ve already used up \pi^n_i! Anyway, we should get another formula expressing the way \forall distributes over \wedge.

Any others? I’m probably missing a few minor ones, like \pi^n_i(\bot)=\bot.
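
Here is a quick Python check of these laws in the semantic picture, where an n-ary predicate is a subset of V^n and existential quantification over the i-th slot literally projects that coordinate away. The toy universe, the sample predicates, and the names exists_ and forall_ are all invented for the test; it’s only a sketch of the intended laws, not part of the formalism.

```python
from itertools import product

V = {0, 1, 2}                          # toy universe

def full(n):
    return set(product(V, repeat=n))

def exists_(i, p):
    """Existential quantification over x_i: project the i-th coordinate away."""
    return {t[:i-1] + t[i:] for t in p}

def forall_(i, p, n):
    """Universal quantification over x_i, defined as not-exists-not."""
    return full(n - 1) - exists_(i, full(n) - p)

# Two sample 2-ary predicates.
p = {(a, b) for (a, b) in full(2) if a < b}
q = {(a, b) for (a, b) in full(2) if a + b == 2}

i, n = 1, 2
# 'exists' commutes with 'or' ...
assert exists_(i, p | q) == exists_(i, p) | exists_(i, q)
# ... but only half-commutes with 'and' (here the inclusion is strict):
assert exists_(i, p & q) <= exists_(i, p) & exists_(i, q)
# 'forall' commutes with 'and', and 'exists' sends bottom to bottom:
assert forall_(i, p & q, n) == forall_(i, p, n) & forall_(i, q, n)
assert exists_(i, set()) == set()
```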

Just for the record, I’ll throw out that I haven’t forgotten all the category theory I learned while going through Leinster’s book. I remember that the left and right adjoints to a projection function are existential and universal quantification, and I sort-of remember what that means. I hope we’re not going there right away, though.

JB: That’s exactly where I’m trying to go. The slick way to define these polyadic Boolean algebras—or the nobler things they are struggling to become—uses adjoints. But don’t worry, we’ll do it slow enough that it makes sense.

You’re right: if B(n) is our Boolean algebra of n-ary predicates, existential quantification over the i-th variable, where 1 \le i \le n, gives a map from B(n) to B(n-1). Universal quantification gives another such map. We can write down all the laws these maps satisfy: the ways they interact with the Boolean algebra operations ‘and’, ‘or’ and ‘not’. You’ve got most of them—you can probably do this better than I can! But it would be nice to get them in a more systematic way.
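
One way to get them systematically, at least in the semantic picture, is as a pair of Galois connections: writing w \colon B(n-1) \to B(n) for the map that tacks on a dummy i-th variable, we have \exists_i p \le q exactly when p \le w(q), and w(q) \le p exactly when q \le \forall_i p. Here is a brute-force Python check over a two-element universe; the universe and all names below are invented for the test, and this is only a sketch of the idea, not the official definition.

```python
from itertools import chain, combinations, product

V = {0, 1}                                         # tiny universe, so we can check everything

def full(n):
    return set(product(V, repeat=n))

def subsets(s):
    s = list(s)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def exists_(i, p):                                 # project coordinate i away
    return {t[:i-1] + t[i:] for t in p}

def forall_(i, p, n):                              # not-exists-not
    return full(n - 1) - exists_(i, full(n) - p)

def weaken(i, q, n):                               # add a dummy i-th variable
    return {t[:i-1] + (v,) + t[i-1:] for t in q for v in V}

i, n = 1, 2
for p in subsets(full(n)):                         # all 2-ary predicates
    for q in subsets(full(n - 1)):                 # all 1-ary predicates
        # 'exists_i' is left adjoint to weakening ...
        assert (exists_(i, p) <= q) == (p <= weaken(i, q, n))
        # ... and weakening is left adjoint to 'forall_i'.
        assert (weaken(i, q, n) <= p) == (q <= forall_(i, p, n))
```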

If we’re trying to understand all these laws systematically, it would be tragic if we failed to notice that there are a bunch of other important maps between these Boolean algebras B(n), coming from ‘substitution of variables’.

For example, if I have a 3-ary predicate P(x_1,x_2,x_3), I can turn it into a 2-ary predicate in various ways, like P(x_1,x_1,x_2) or P(x_2,x_1,x_1). If you think about it, we get one such way for each map from {1,2,3} to {1,2}. See how? So, there’s something about functions between finite sets going on here.

MW: I see! You’re generalizing the diagonal. Hmm, Mr. Diagonal is a big-shot in category theory, as I recall.

JB: Right! Let’s follow this clue. The maps I just used, from {1,2,3} to {1,2}, are surjections. An interesting degenerate case of a surjection is a bijection. We can use bijections from {1, 2,…,n} to itself to permute variables. For example, given a 3-ary predicate P(x_1,x_2,x_3) I can get others like P(x_3,x_1,x_2).

What about injections? Consider an example: there’s an injection f \colon \{1,2\} \to \{1,2,3\} with f(1) = 1, f(2) = 3. We can use this to turn the 2-ary predicate P(x_1,x_2) into the 3-ary predicate P(x_1,x_3).

MW: That’s not a 3-ary predicate!

JB: Ah, but we can think of it as a 3-ary predicate. It’s a predicate of x_1,x_2,x_3 that just happens to not depend on x_2. It should remind us of how we can have a function of 3 variables that depends trivially on some of the variables, like f(x,y,z) = xz.

MW: Hmm, true.

JB: Putting all these ideas together and going a bit further: each function f \colon \{1,\dots,m\} \to \{1,\dots,n\} gives a Boolean algebra homomorphism from B(m) to B(n). We just use f to rename variables. It makes sense to call these homomorphisms

B(f) \colon B(m) \to B(n)

since they’re just as fundamental a part of our polyadic Boolean algebra as the B(n)’s. And it’s easy to see that

B(fg) = B(f) B(g)

whenever we can compose f and g. So we’re getting quite a nice bunch of operations on our Boolean algebras, even before we include quantifiers. Do you see the slick way to express what we’ve got here?

MW: It’s a pullback. (I don’t mean a pullback square, just a plain old pullback.) An n-ary predicate is a function from V^n to {true, false}. A point in V^n is an n-tuple (x_1,\ldots,x_n), which we can think of as a function from {1,…,n} to V, thus: x \colon i\mapsto x_i. So if we have a function f\colon \{1,\ldots,m\}\rightarrow\{1,\ldots,n\}, we form the composition

\{1,\ldots,m\}\xrightarrow{f}\{1,\ldots,n\}\xrightarrow{x}V^n\xrightarrow{P}\{\text{true},\text{false}\}

with P being the n-ary predicate.

JB: Hmm, I guess you’re using V to mean the ‘universe’. You’re treating n-ary predicates as functions from V^n to {true, false}. There’s a Boolean algebra B(n) of all these predicates. This is isomorphic to the one I mentioned a while back, consisting of all subsets of V^n. And you’re showing how any function

f\colon \{1,\ldots,m\}\rightarrow\{1,\ldots,n\}

gives a map

B(f) \colon B(n) \to B(m)

There’s something good about this, but also something strange. First, you’re getting a function B(f) \colon B(n) \to B(m), while I was talking about a function B(f) \colon B(m) \to B(n). This business always throws me for a loop.

Second, this set V is not a fundamental part of our formalism! Remember: in a polyadic Boolean algebra, n-ary predicates are really just elements of a Boolean algebra B(n). This Boolean algebra might be the Boolean algebra of all functions f \colon V^n \to \{\textrm{true},\textrm{false}\}. That’s a possibility. But it might not: for example, it might be a Boolean algebra of n-ary predicates written in some language. Both examples are equally important: one is good for syntax, the other for semantics. There are also lots of other possibilities. The reason for introducing polyadic Boolean algebras is to put all these possibilities on an equal footing.

My question was about all these possibilities at once:

It makes sense to call these homomorphisms

B(f) \colon B(m) \to B(n)

since they’re just as fundamental a part of our polyadic Boolean algebra as the B(n)’s. And it’s easy to see that

B(fg) = B(f) B(g)

whenever we can compose f and g. So we’re getting quite a nice bunch of operations on our Boolean algebras, even before we include quantifiers. Do you see the slick way to express what we’ve got here?

Nothing about any ‘universe’ here! The answer is visible directly from the question, with no reference to anything else—but only if you put on your category theorist goggles! Do you want another crack at it, or should I just say what I’m fishing for?

MW: Oh, you just mean it’s a functor. I thought you wanted a characterization that would tell you what functor it was.

JB: Right, it’s a functor! I don’t want to know what functor it is, because that’s impossible at this stage: we’re studying a completely general polyadic Boolean algebra, so it could be practically anything. I’m trying to take the definition of ‘polyadic Boolean algebra’ and wrangle it into its modern category-theoretic form. So far it looks like a functor B from the category of finite sets \{1,\dots,n\} and functions between these to the category of Boolean algebras. But this isn’t all. For one thing, we need to get universal and existential quantifiers back into the game. As you hinted, we’ll do this using adjoints.

But first: it’s a bit silly to work only with finite sets of the form \{1,\dots,n\}. These are called finite ordinals, as you well know. But every finite set is isomorphic to one of these, so the category with

  • finite ordinals as objects and functions between them as morphisms

is equivalent to the category FinSet, with

  • finite sets as objects and functions between them as morphisms.

Category theorists consider equivalent categories ‘the same for all practical purposes’, so we can use whichever one we want. FinSet is more flexible, since it lets us use any finite set as a set of names of the variables in our predicates.

So, this is what our polyadic Boolean algebra looks like so far:

B \colon \textrm{FinSet} \to \textrm{BoolAlg}

A functor from the category of finite sets to the category of Boolean algebras! Terse, clean, simple.
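
Concretely, over a toy universe and in the finite-ordinal picture, B sends n to the subsets of V^n and sends a function f to renaming of variables. Here is a Python sketch of that—the universe, the sample predicate, and all the names are invented—which checks the P(x_1,x_1,x_2) example from earlier and the equation B(fg) = B(f)B(g).

```python
from itertools import product

V = {0, 1, 2}                          # toy universe

def full(n):
    return set(product(V, repeat=n))

def B(f, m, n, p):
    """For f : {1,...,m} -> {1,...,n} (a dict) and an m-ary predicate p (a subset
    of V^m), return the n-ary predicate  x |-> p(x_f(1), ..., x_f(m))."""
    return {x for x in full(n) if tuple(x[f[i] - 1] for i in range(1, m + 1)) in p}

# A 3-ary predicate P(x1, x2, x3): "x1 + x2 = x3".
P = {(a, b, c) for (a, b, c) in full(3) if a + b == c}

# The map f : {1,2,3} -> {1,2} with f(1)=1, f(2)=1, f(3)=2 yields the 2-ary
# predicate P(x1, x1, x2), i.e. "x1 + x1 = x2".
f = {1: 1, 2: 1, 3: 2}
assert B(f, 3, 2, P) == {(a, b) for (a, b) in full(2) if a + a == b}

# Functoriality: renaming along g and then along f is renaming along f o g.
g = {1: 2, 2: 3, 3: 1}                 # a permutation of {1,2,3}
fg = {i: f[g[i]] for i in g}           # the composite f o g : {1,2,3} -> {1,2}
assert B(f, 3, 2, B(g, 3, 3, P)) == B(fg, 3, 2, P)
```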

But we still need to get those existential quantifiers into the game. And we should figure out why you came up with a contravariant functor, one that turns a function f \colon \{1,\dots,m\} \to \{1,\dots,n\} into a Boolean algebra homomorphism B(f) \colon B(n) \to B(m), instead of a covariant functor, which gives B(f) \colon B(m) \to B(n). About 30% of the time I think a contravariant functor is what we really want in this game, but 70% of the time I’m sure we need a covariant one. My long discussion of how functions give Boolean algebra homomorphisms was supposed to settle this question:

For example, if I have a 3-ary predicate P(x_1,x_2,x_3), I can turn it into a 2-ary predicate in various ways, like P(x_1,x_1,x_2) or P(x_2,x_1,x_1). If you think about it, we get one such way for each map from {1,2,3} to {1,2}. See how? So, there’s something about functions between finite sets going on here.

Now I’m confused again. Maybe I just need more coffee.

[More coffee.]

Oh, I see. If we use your method correctly we get a covariant functor, as I hoped. For any finite set of variables S we want a Boolean algebra B(S). You’re getting this from a set V as follows:

B(S) = \{\textrm{true},\textrm{false}\}^{V^S}

That is, you’re using functions from V^S to truth values to describe subsets of V^S.

And here’s the deal: the functor sending any finite set S to the set V^S is contravariant, using pullbacks: any function f \colon S \to T gives a function from V^T to V^S. This uses a pullback, just as you said: given p \in V^T this gets mapped to p \circ f \in V^S.

For the same reason, the functor sending any set X to the set \{\textrm{true},\textrm{false}\}^X is also contravariant. Composing these two contravariant functors we get a covariant functor

B \colon \textrm{FinSet} \to \textrm{BoolAlg}.

That is, any function f \colon S \to T gives a Boolean algebra homomorphism B(f) \colon B(S) \to B(T).

You had me fooled for a while there! You said you were doing a pullback, but actually you were doing two. Two wrongs don’t make a right, usually, but two contravariant functors make a covariant one.
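
Spelled out over a toy universe, the ‘two pullbacks’ look like this in Python: precomposition with f pulls points back, precomposition with that pulls predicates forward, and the composite is covariant, as promised. The finite sets of variable names, the sample predicate, and everything else below are made up for the check; it’s just a sketch of the computation above.

```python
from itertools import product

V = {0, 1}                                     # toy universe
S, T, U = {"x"}, {"x", "y"}, {"x", "y", "z"}   # finite sets of variable names

def points(A):
    """All functions A -> V, i.e. the set V^A, with each point stored as a dict."""
    names = sorted(A)
    return [dict(zip(names, vals)) for vals in product(V, repeat=len(names))]

def pull_point(f, p):
    """First contravariant step: f : A -> B (a dict) pulls a point p of V^B
    back to the point p o f of V^A."""
    return {a: p[f[a]] for a in f}

def B_(f, pred):
    """Second contravariant step: precompose a predicate (a function V^A -> bool)
    with the pullback of points, giving a predicate on V^B.  Net effect: covariant."""
    return lambda q: pred(pull_point(f, q))

f = {"x": "y"}                                 # f : S -> T
g = {"x": "x", "y": "z"}                       # g : T -> U
gf = {a: g[f[a]] for a in f}                   # the composite g o f : S -> U

pred = lambda p: p["x"] == 0                   # a sample predicate on V^S

# Covariance: B(g o f) agrees with B(g) B(f), checked on every point of V^U.
for q in points(U):
    assert B_(gf, pred)(q) == B_(g, B_(f, pred))(q)
```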
