Category Archives: Math

First-Order Categorical Logic 13

Prev TOC Next

MW: It’s been a minute! Well, almost 60,000 minutes.

We left off with a question: does a natural transformation from a syntactic hyperdoctrine to a semantic hyperdoctrine automatically “respect quantifiers”? We saw that this amounts to a Beck–Chevalley condition. We wondered if we had to add that condition to our definition of a model, or if it came for free.

Maybe you’ve had the experience of putting aside a crossword, half-completed, and coming back to it in a hour. Hey, of course the answer to 60 Across is “Turing Test”!

The same thing happened to me here. And the answer is: no free lunch.

Let me recap, just to get my brain back up to speed, and fix notation. B is a syntactic hyperdoctrine, C is a semantic hyperdoctrine, and F:BC is a natural transformation.

For example: say B is the theory LO of linear orders, and C is an actual linear order. B(2) contains predicates like [x<y], C(2) contains binary relations on the domain of the linear order, and F2([x<y]) is the less-than relation on this domain.

Here’s the key diagram:

Ignore the dashed arrows for just a second. Say f is an injection in FinSet, like f:{x}→{x,y}. Then B(f) and C(f) are the “throw in extra variables” morphisms we’ve talked about ad nauseum. This diagram is a commuting square in the category BoolAlg.

The dashed arrows are morphisms in a larger category we dubbed PoolAlg. This is a full subcategory of Poset, and BoolAlg is a so-called wide subcategory of it. Its objects are all boolean algebras, and its morphisms are all order-preserving maps, i.e., all the morphisms those objects have as denizens of Poset.

Within PoolAlg, the dashed arrows are left adjoints of the right arrows. Now does the diagram commute? In PoolAlg, that is. Beck–Chevalley says yes. We saw last time that we want this to be true. For example, the top left arrow takes [y<x] to [∃y(y<x)]. We want this predicate to map to the corresponding relation on an actual linear order. That’s what we mean by “respecting quantifiers”.

JB: This sounds right. Thanks for getting us started again!

As you might expect, I have a couple of nitpicks. While I feel sure there’s no free lunch here, I don’t think you have proved it. Maybe for some reason the Beck–Chevalley condition always holds in this situation! I feel sure it doesn’t, but I think that can only be shown with a counterexample: a defective model that doesn’t obey Beck–Chevalley.

It’s probably easy enough to find one. However, I don’t feel motivated to do it. We have bigger fish to fry. I’m happy to assume our models obey this Beck–Chevalley condition.

(Do we also need to assume they obey some sort of Frobenius condition?)

And here’s another even smaller nitpick: I don’t see the need for this category PoolAlg. I believe whenever you’re tempted to use it we can use our friend Poset. For example, the dashed arrows are left adjoints in Poset, and the square containing those dashed arrows commutes in Poset. Saying PoolAlg—restricting the set of objects in that way—isn’t giving us any leverage. It doesn’t seriously hurt, but I prefer to think about as few things as is necessary.

Maybe it’s time to finally try to state and prove a version of Gödel’s completeness theorem. Do you remember our best attempt at stating it so far? I think I can, just barely… though it’s somewhat shrouded by the mists of time.

MW: That’s right, to prove the “no free lunch” result we need a counter-example. That’s what came to me when I started thinking about this stuff again. A way to construct a whole slew of counter-examples.

And I think it’s worth going through, because it relates to Henkin’s proof of Gödel’s completeness theorem. The cat-logic proof will have to surmount the same obstacles. So here goes.

Say B is a theory, aka syntactic hyperdoctrine. Say C is a semantic hyperdoctrine, and F:BC is one of those natural transformations that does respect quantifiers. Suppose the domain of C is V: C(n) is a set of n-ary relations on V. Now let W be a subset of V. Let D(n) be all the relations you get by restricting the relations in C(n) to W. And for any predicate φ∈B(n), let Gn(φ) be the restriction of Fn(φ) to W. Let be the image of B(n) under Gn. I claim that D is a hyperdoctrine, and G is a natural transformation from B to D. And very often, G will be disrespectful of quantifiers.

Using my LO example, let C have V=ℤ as its domain, and let W=ℕ. The predicate [y<x] gets sent to a less-than relation on ℤ by F2. This restricts to a less-than relation on ℕ. So G2([y<x]) is that relation.

Now let’s apply the left adjoints of the injection {x}→{x,y}. Up in B, we get the predicate [∃y(y<x)]. F1 sends this to the always-true relation in ℤ, which of course restricts to the always-true relation in ℕ. What about the left adjoints in C and D? The relation supy(y<x) is always true in ℤ, but is x≠0 in ℕ. So “go left, then down via G1’’ gets you to always-true in D, while “go down via G2, then left” gets you to x≠0.

Another example: let V=ℚ, W=ℤ, and for our predicate, use [x<z<y]. The left adjoint from {x,y}→{x,y,z} is [∃z(x<z<y)]. Taking one path brings us to the relation x<y in ℤ, while the other path leads to “follows, but not immediately”.

It’s clear now that one path leads to a “model” where ∃x means, “does there exist an x in a larger domain?”. The Henkin proof features a whole sequence of such enlargements. Thinking about that suggested this class of counter-examples to me.

I believe that restricting a semantic hyperdoctrine this way results in a hyperdoctrine. I haven’t checked all the details. But it’s a piece of cake when C(n) equals P(Vn) for all n. In that case, D(n) is just P(Wn)! This is what you wanted for semantic hyperdoctrines in the first place, as I recall.

The other thing we need to check: that G is a natural transformation. The key
diagram:

Here, ↾m and ↾n are the restriction maps. It’s pretty obvious that the bottom square commutes. Since the top square does too, an easy diagram chase shows that “outer frame” square does.

Anyway, you wanted a refresher on the statement of the Gödel completeness theorem in our framework. Here goes. A syntactic hyperdoctrine B is consistent if B(0) is not the one-element boolean algebra {⊤=⊥}. The theorem states that every consistent syntactic hyperdoctrine has a model, which is a natural transformation to a semantic hyperdoctrine that respects quantifiers.

(And maybe also equality? Haven’t thought about that yet.)

We want to adapt the Henkin proof to this framework, is that right? I think I see, in a vague way, how to do that. But I don’t see how we’d be leveraging the framework—how the proof would be anything but a straighforward translation, not really using any of the neat category ideas.

JB: Okay, great. Thanks for all that.

Let’s try to clean things up a wee bit before we dive in. When you say “syntactic hyperdoctrine” I’d prefer to just say “hyperdoctrine”.

First, it seems plausible that any hyperdoctrine is consistent iff has a model. Second, part of the whole goal of working categorically is to break down the walls between syntax and semantics, to treat them both as entities of the same kind (in this case hyperdoctrines).

So, we should aim for a version of Gödel’s completeness theorem saying that a hyperdoctrine B is consistent iff it has a morphism F to a hyperdoctrine C of some particular “semantic” sort. You said B should be “syntactic”. That’s great as far as it goes, but here’s a place where we can generalize and do something new—so let’s boldly assume B is an arbitrary hyperdoctrine.

MW: Wait a minute! What do you mean by “morphism” here?

JB: A morphism of hyperdoctrines. Do you object? (That was a pun.)

MW: Who could object to a three layer cake?

In the bottom layer are boolean algebras, regarded as poset categories. The middle filling consists of categories whose objects are boolean algebras; a hyperdoctrine is a functor from FinSet to one of them. And now you propose to top it off with a category whose objects are hyperdoctrines. Copacetic!

Anyway, we never defined a morphism of hyperdoctrines. Do you mean a natural transformation respecting quantifiers?

JB: Well, perhaps not exactly that. We certainly want F to be a natural transformation respecting quantifiers, but we probably want it to be more than that, and we haven’t yet figured out exactly what. I expect that maybe F should be a natural transformation whose components obey the Beck—Chevalley and Frobenius conditions. That should make it respect quantifiers and equality. But we really need to think about this harder before we can be sure! So I said “morphism”, to leave the issue open.

We will probably resolve this issue as part of proving Gödel’s completeness theorem. It’s a time-honored trick in math: you make up the definitions while you’re proving a theorem, so your definitions make the theorem true.

Okay, where was I? I wanted to think about Gödel’s completeness theorem this way: it says a hyperdoctrine B is consistent iff it has some morphism F to a hyperdoctrine C of some “semantic” sort. And by “semantic” let’s mean the most traditional thing: we’ll demand that C be a hyperdoctrine where we pick a set V and let C(n) be the power set of Vn. That is semantic par excellence.

So how are we going to do this? Whatever trick people ordinarily use to prove Gödel’s completeness theorem—and you already mentioned the key word: “Henkin”—let’s try to adapt it to a general hyperdoctrine B. Maybe we can sort of pretend that B is defined “syntactically”, but try to use only its structure as a hyperdoctrine.

MW: Okay, this is the sort of thing I was hoping for!

I’m intrigued. The Henkin proof leans heavily on syntax. It begins by adding a bunch of new constants and axioms. How can we do this, while forswearing syntax?

JB: A question for the next installment….

Prev TOC Next

Leave a comment

Filed under Categories, Conversations, Logic

Set Theory Jottings 10. Axiomatic Set Theory

Prev TOC Next

“An Axiom, you know, is a thing that you accept without contradiction. For instance, if I were to say ‘Here we are!’ that would be accepted without any contradiction, and it’s a nice sort of remark to begin a conversation with. So it would be an Axiom. Or again, supposing I were to say, ‘Here we are not!’, that would be—”

“—a fib!” cried Bruno.

“that would be accepted, if people were civil”, continued the Professor; “so it would be another Axiom.”

“It might be an Axledum”, Bruno said: “but it wouldn’t be true!

—Lewis Carroll, Sylvie and Bruno Concluded

To get to the “good stuff” in math, you almost always need some set theory. Zermelo-Fraenkel set theory (ZF), plus the axiom of choice (AC; ZF+AC=ZFC) has become the standard first-order axiom system for set theory.

Before diving into the details, some generalities on axiom systems. Nowadays we’re pretty chill about them; you can take any collection you like (hopefully consistent) for a theory, and then you can start writing your thesis. Not, perhaps, an interesting thesis, but at any rate Bruno won’t complain that your axioms aren’t true!

For the Greeks, the axioms and postulates were true, in some sense. Idealized, sure, but descriptive of reality. This tie began to fray with the discovery of non-Euclidean geometries. Algebraic axiom systems, like those for groups and for fields, appear by the end of the 19th century.

For roughly two thousand years after Euclid, most math developed without axioms. Take calculus as an example. You have the rules of calculus, but you don’t see anything like the Euclidean treatment of geometry. This remained true even as people subjected its foundations to stricter and stricter scrutiny. Mathematical intuition reigned supreme.

Hilbert’s Grundlagen der Geometrie (Foundations of Geometry, 1899) pushed towards a more formalist attitude. A celebrated quote of his, from years earlier, sums it up nicely:

One must be able to say at all times, instead of points, lines, and planes: tables, chairs, and beer mugs.

At times Cantor seemed to endorse this perspective:

Mathematics is entirely free in its development, and its concepts are only bound by the necessity of being consistent, and being related to the concepts introduced previously by means of precise definitions.

Grundlagen einer allgemeinen Mannigfaltigkeitslehre (Foundations of a general theory of sets)

But he held strong opinions on what’s true in mathematics:

I entertain no doubts as to the truths of the tranfinites, which I recognized with God’s help and which, in their diversity, I have studied for more than twenty years; every year, and almost every day brings me further in this science.

—Letter from Cantor to Jeiler, quoted in Dauben (p.147).

On the other hand, he referred to the “Cholera-Bacillus of infinitesimals”, and called them “nothing but paper numbers!” (Dauben, p.131). The Continuum Hypothesis was for him a question of fact.

Two other themes run through this period: mathematics as a mental activity, and as logic.

Recall that Boole titled his famous treatise An Investigation of the Laws of Thought: on Which are Founded the Mathematical Theories of Logic and Probabilities. Cantor’s definition of “set” in his last major work reads

By a set we are to understand any collection into a whole M of definite and separate objects m of our intuition or our thought.

Here is the first sentence of Dedekind’s Was sind und sollen die Zahlen?: “In what follows I understand by thing every object of our thought.” His proof of the existence of an infinite set relies on this ontology:

Theorem: There exist infinite systems.

Proof: My own realm of thoughts, i.e., the totality S of all things which can be objects of my thought, is infinite. For if s signifies an element of S, then the thought s′, that s can be an element of my thought, is itself an element of S

[Dedekind then appeals to his definition of infinite as having a bijection with a proper subset.]

Frege severely criticized this injection of psychology into mathematics. Cantor’s “proof” of the Well-Ordering Theorem suffers from it, as it consists of successively choosing elements of the set to be well-ordered. If we take this literally, then the choices must take place at an increasing sequence of times t1<t2<…. This limits us to ordinals that are “realizable in ℝ’’, and thus to countable ordinals (see post 4). Yet Cantor claimed that every set can be well-ordered, in particular ℝ.

This is why Zermelo was at pains to say in his second proof of the Well-Ordering Theorem, “…the ‘general principle of choice’ can be reduced to the following axiom, whose purely objective character is immediately evident.” (My emphasis.)

Both Frege and Russell held that the truths of mathematics are logical facts. Thus we find debates on whether Zermelo’s axiom of choice is logically valid. Not surprising, historically. Aristotle’s logic dealt with propositions. From “proposition” we obtain “propositional function”, that is, a proposition with a free variable, like “x is mortal”. It becomes a proposition if we assign a value to the variable (“Socrates is mortal”), or quantify over it (“All men are mortal”). The class of all things satisfying a propositional function went by the name, “extension of a concept”.

Zooming out from these specifics, logic and mathematics both lay claim to necessary truth. This is elaborated in Kantian philosophy. Kant classified mathematical facts as synthetic a priori: necessary truths that go beyond analytic truth, which are true by definition. Poincaré classified the Axiom of Choice as a synthetic a priori judgment, just like the principle of induction.

The rise of formal logic and axiomatic set theory resulted in a sharply drawn boundary between logic and set theory. We have the axioms and rules of inference of first-order logic; then we have the axioms of ZFC or similar systems, which are particular first-order theories. Things weren’t so clear at the dawn of the 20th century.

Prev TOC Next

Leave a comment

Filed under History, Set Theory

Set Theory Jottings 9. Cantor Normal Form

Prev TOC Next

Suppose β>1, and let ζ>0 be arbitrary. Then ζ has a unique representation in so-called Cantor normal form:

ζ α1 χ1+···+βαk χk
α1>···>αk, 1≤χi<β for all i
(1)

Stillwell gives an illustration, close to a proof, in §2.6 (pp.46–47)1, for the most important special case: β=ω and ζ<ε0. (The Cantor normal form for ε0 is just ωε0, not helpful.)

Here’s a more formal proof of the full theorem. First generalize division to all ordinals. If β>1, then any ζ has a unique representation in the form

ζ= βχ+ρ,   ρ<β

Proof: since β>1, β(ζ+1)≥ζ+1, and so there is a least χ′ such that βχ′>ζ. Furthermore χ′ cannot be a limit ordinal: if βξ≤ζ for all ξ<λ, then βλ≤ζ by continuity of multiplication. Set χ=χ′−1, so

βχ≤ζ<β(χ+1)

Now write (by equation (7) of post 8)

ζ=βχ+(−βχ+ζ)=βχ+ρ

setting ρ=−βχ+ζ. We cannot have ρ≥β, otherwise

ζ=βχ+ρ≥βχ+β=β(χ+1)>ζ

Uniqueness needs to be done just right, since ordinal addition has cancellation on the left but not on the right. Suppose

βχ11=βχ22,    ρ12

If χ12 then χ21+γ for some γ>0. So

βχ11=β(χ1+γ)+ρ2=βχ1+βγ+ρ2

Cancelling on the left,

ρ1=βγ+ρ2≥βγ≥β

contradicting the definition of ρ1. So χ12=χ (say), and we can cancel βχ on the left to get ρ12.

The proof of Cantor normal form starts out in a similar fashion. First prove by induction that βα≥α for all α. Since βζ+1>ζ, it follows that there is a least α with βα>ζ. Furthermore α cannot be a limit ordinal by continuity of exponentiation (in the exponent). Set α1=α−1. So

βα1≤ζ<βα1+1

Now we divide ζ by βα1:

ζ=βα1χ11,    ρ1α1

We cannot have χ1≥β, otherwise

ζ=βα1χ11≥βα1β=βα1+1

Repeat with ρ1:

ρ1α2χ22,   χ2<β,   ρ2α2

We must have α21 because βα2≤ρ1α1. Continuing we get a descending sequence of αi’s which must terminate in finitely many steps, because ordinals. This establishes (1).

The proof of uniqueness provides a bonus: a criterion for which of two normal forms is larger. It’s what you’d expect from decimal notation. Suppose

ζ α1 χ1+···+βαk χk
ζ′ = βα1 χ1′+···+ βαm χm

are two unequal normal forms. Because cancellation on the left holds for addition, we can remove any equal terms on the left, and assume either α11′ or α11′ and χ11′ (for the reverse inequalities, just flip things around). Then ζ<ζ′. I will call this the first difference criterion for the ordering of Cantor normal forms.

The proof resembles a decimal computation. To show that 999<1000 (for example), we add 1 and do the carries. Here, we begin by combining the last two terms of ζ, using the facts that χk<β and αk−1k:

βαk-1 χk-1αk χk αk-1 χk-1αk+1
≤βαk-1 χk-1αk-1
αk-1k-1+1)

Next we combine with the previous term, using the fact that χk−1+1≤β and αk−2k−1:

βαk-2 χk-2αk-1k-1+1) ≤βαk-2 χk-2αk-1+1
≤βαk-2k-2+1)

We keep going, until eventually we have

ζ<βα11+1)

If α11′ and χ11′, then we have βα11+1)≤βα1 χ1′. Thus ζ is less than the first term of ζ′. If α11′ then we use the fact that χ1+1≤β, concluding that βα11+1)≤βα1+1≤βα1, and hence ζ is again less than the first term of ζ′. A fortiori, ζ<ζ′.

Uniqueness follows. These ordering criteria are needed for the Goodstein and Hydra theorems.

[1] However, Stillwell messes up at the end of the example. In the last bullet he says, “This means that α is a term in the following sequence with limit ωω2·7+ω·4+11+ω= ωω2·7+ω·4+12.” This equation is incorrect; the last “+ω’’ should be “·ω’’. The sequence on the next line should have “·1’’, “·2’’, etc., instead of “+1’’, “+2’’, etc. So he has many more steps to go.

Another point. When he says that α “falls between” two terms in a sequence, he’s not entitled to assume it falls strictly between. He should say instead that there are two consecutive terms with α greater than or equal to the first and less than the second.

Prev TOC Next

Leave a comment

Filed under Set Theory

Set Theory Jottings 8. Ordinal Arithmetic

Prev TOC Next

Usually one defines the ordinal operations via transfinite induction:

Continue reading

2 Comments

Filed under Set Theory

Set Theory Jottings 7. The (Cantor-Dedekind-Schröder)-Bernstein Theorem

Prev TOC Next

The trichotomy of cardinals says that for any 𝔪 and 𝔫, exactly one of these holds: 𝔪<𝔫, 𝔪=𝔫, or 𝔪>𝔫. It’s equivalent to the conjunction of these two propositions, for any two cardinals 𝔪 and 𝔫:

Continue reading

Leave a comment

Filed under History, Set Theory

Set Theory Jottings 6. Zorn’s Lemma

Prev TOC Next

Zermelo’s 1904 proof of the well-ordering theorem got a lot of blowback, as we’ve seen. On the other hand, the very next year Hamel used it to prove the existence of a so-called Hamel basis. In 1910, Steinitz made numerous applications in the theory of fields. He wrote:

Continue reading

Leave a comment

Filed under History, Set Theory

Nonstandard Models of Arithmetic 32

Prev TOC Next
Previous Paris-Harrington post

[Ed. note: This post was essentially ready two years ago, but I got distracted with other matters. If you’re seeing this for the first time, or want to refresh your memory, posts 8 and 9 introduced the Paris-Harrington theorem. Posts 21 through 24 continued the discussion, in a dialog with Bruce Smith. MW]

Continue reading

Leave a comment

Filed under Conversations, Peano Arithmetic

First-Order Categorical Logic 12

Prev TOC Next

MW: Last time we looked at the categorical rendition of “C is a model of B”:

  • Functors B:FinSet→BoolAlg and C:FinSet→BoolAlg
  • A natural transformation F:BC

where B and C are hyperdoctrines, and

  • B is syntactic: the elements of each B(n) are equivalence classes of formulas (which we agreed to call predicates);
  • C is semantic: the elements of each C(n) are relations on a domain V.

(We’ve been saying that C(n) is the set of all n-ary relations on V, but I see no need to assume that.)

Continue reading

Leave a comment

Filed under Categories, Conversations, Logic

Set Theory Jottings 5. Zermelo to the Rescue! (Part 1)

Prev TOC Next

Ernst Zermelo is remembered today chiefly for two results. His 1904 paper “Proof that every set can be well-ordered” introduced the Axiom of Choice. His 1908 paper “Investigations in the foundations of set theory” led to the most popular axiomatization of set theory. He thus claims credit for two of the letters of ZFC: Zermelo-Fraenkel with Choice.

Continue reading

5 Comments

Filed under History, Set Theory

First-Order Categorical Logic 11

Prev TOC Next

MW: Last time we justified some equations and inequalities for our adjoints: they preserve some boolean operations, and “half-preserve” some others. And we incidentally made good use of the color palette!

Continue reading

Leave a comment

Filed under Categories, Conversations, Logic