Year Zero

Awhile back, the BBC website History Extra had a post that included this tidbit:

AD 0… the date that never was

The AD years of the Christian calendar are counted from the year of Jesus Christ’s birth, and, as the number zero was then unknown to the west, Dionysius began his new Christian era as AD 1, not AD 0. …

This evoked the ire of the noted historian Thony Christie. In a post Something is Wrong on the Internet, he explained:

It was common practice to number the years according to the reign of the current monarch, emperor, despot or whatever. So for example the year that we know as 47 BCE would have been the third year of the reign of Gaius Julius Caesar.

Thus Dionysius Exiguus (Dennis the Short) would say that some event occurred in (say) “the 3rd Year of Our Lord”; in Latin, “Anno Domini”, hence AD.

For this discussion, focus your attention on the so-called ordinal number “3rd”. Zero as a cardinal number has been around a long time, but the ordinal “zeroth” remains rather uncommon. I did a Google ngram search, and “zeroth” is almost nonexistent before about 1930. As of 2000, “zero” is nearly 200 times more frequent than “zeroth”. (The alternate spelling “zeroeth” is even scarcer.)

Nowadays one rarely explicitly uses ordinals for years. When is the last time you heard someone say, “In the US election in the 2018th year of the Common Era, the Democrats did well”? Grammatically, the phrase “the 2018 US election” contains a cardinal number, not an ordinal. Thony demurs:

This distinction between cardinal and ordinal numbers becomes confused when we talk about historical years. We refer to the year five hundred CE when in fact we should be saying the five hundredth year CE, as it is an ordinal and not a cardinal. … We are enumerating the members of an ordered set not counting the number of objects in a collection. Because this is the case there cannot be a zeroeth year. End of discussion!

(Note that we still use ordinals for centuries: “the 19th century witnessed the discovery of Maxwell’s equations”.)

In the recent post There is no year zero! Thony repeats the argument:

When we talk about years we tend to use the words for cardinal numbers but in fact we are actually talking about ordinal numbers. What we call 2019 CE or AD i.e. two thousand and nineteen is in fact the two thousand and nineteenth year of the Common Era or the two thousand and nineteenth year of Our Lord. …  [T]here is and never can be a year zero, it is, as stated abve, a contradiction in terms and cannot exist. … The second decade started on the 1st January 2011 and will end on the 31st December 2020 and not on 31st December 2019 as various innumerate people would have you believe. [My emphasis]

I love the “take no prisoners” tone. Historically Thony has a strong case. But logically—is there room for another viewpoint? When we index an ordered collection, must we start the count with 1?

Of course not! Some computer languages index arrays starting with 0; others allow you to pick the initial index. In mathematics, sometimes 0 proves more convenient, sometime 1.

For example, when von Neumann laid the foundations of the modern treatment of Cantor’s ordinal numbers, he picked 0 as the first ordinal. This allowed him to define n as the set of all its preceding ordinals: n={0,…,n-1}. This has the pleasant consequence that n contains exactly n elements. (Bold indeed anyone who calls von Neumann “innumerate”.)

OK, fine for pure math and computer programs? But what about years?

It’s no surprise that the mathematically bent regret the absence of a Year 0. After all, the universe did not begin in 1 AD. You’d want the year n BCE to correspond to the index –n, but that would result in the messed-up indexing

…, -3, -2, -1, 1, 2, 3, …

Naturally our friend Dennis didn’t worry about this. As Thony notes, “The idea of BC years or dates only came into use in Early Modern period.” (And yeah, negative numbers.)

History often saddles us with inelegant conventions. Measuring spoons! US weights and measures! Electric current! Dionysius had a good rationale for starting his scheme with 1, but that rationale has evaporated into the mists of history.

Thony was on his home turf calling out the BBC for their historical goofs. But when he claims that logically we couldn’t have indexed the years


I fear he’s on shaky ground.

Shortly before 2000, Stephen Jay Gould wrote a book, Questioning the Millennium. He devoted the second chapter to the 2000 vs. 2001 debate. I leave the last word to him.

Some questions have no answers because we cannot get the required information. … Many of our most intense debates, however, are not resolvable by information of any kind, but arise from conflicts in values or modes of analysis. … A subset of these unresolvable debates—ultimately trivial, but capable of provoking great agitation, and thus the most frustrating of all—have no answers because they are about words and systems, rather than things.

[Added later: I just noticed that Sniffoy’s comment to Thony’s 2014 post makes much the same points. I must have read it at the time, since I replied to it!]


Filed under History

Non-standard Models of Arithmetic 13

Prev TOC

MW: OK, back to the main plotline. Enayat asks for a “natural” axiomatization of PAT. Personally, I don’t find PAT all that “unnatural”, but he needs this for Theorem 7. (It’s been a while, so remember that Enayat’s T is a recursively axiomatizable extension of ZF.)

He defines ΦT to be the set of all formulas


In other words, φ ranges over all sentences (closed formulas) of L(PA), and Tn ranges over all finite “initial fragments” of T (i.e., the first n sentences of T). Enayat proves that PA+ΦT  is equivalent to PAT.

Before looking at the proof of the equivalence, let’s ask: where did ΦT come from? What’s the meaning of these formulas \varphi\rightarrow\text{Con}(T_n+\varphi^\mathbb{N})? Let’s pick an example—say φ is the Goldbach conjecture. And let’s say n is a zillion. So \varphi\rightarrow\text{Con}(T_n+\varphi^\mathbb{N}) says, “If the Goldbach conjecture is true, then it’s consistent with the first zillion axioms of T.” Or in other words, if it’s true, then there’s a model M of Tn in which the Goldbach conjecture holds. Sounds pretty innocuous, doesn’t it?

The proof of the equivalence has some fine points. Start with the implication, \text{PA}^T\Rightarrow  \text{PA}+\Phi_T. Use the definition of PAT as all sentences true in all \mathbb{N}^M. Suppose M is a model of T. Now, either φ holds in \mathbb{N}^M or it doesn’t. If it doesn’t, the implication


has a false antecedent, so we’re home free! On the other hand, if φ does hold in \mathbb{N}^M, then M is a model of \varphi^\mathbb{N}, and of course also of T. In other words, M is a model of T+\varphi^\mathbb{N}. Any theory with a model is consistent, so


You might think we’re done, but not so! Notice the Tn in Enayat’s ΦT . All we’ve shown so far is that \mathbb{N} satisfies \text{Con}(T+\varphi^\mathbb{N}); we need to show that \mathbb{N}^M does.

This might seem like a minor issue. To show it’s not, let’s look at PA+¬Con(PA). (Feel familiar? The last post did the same thing with ZF in place of PA.) This is consistent! At least if PA is consistent, by Gödel’s second incompleteness theorem. Let’s call it PAinsecure, because it’s a self-doubting theory. Since it’s consistent, it has a model, call it N. \mathbb{N} satisfies Con(PAinsecure), but N does not! What gives?

Well it’s quite simple. N has a proof of a contradiction of non-standard length.

Returning to \text{Con}(T+\varphi^\mathbb{N})…  We know \mathbb{N} satisfies \text{Con}(T+\varphi^\mathbb{N}); what about \mathbb{N}^M? We need to show there’s a model of our theory inside M. In other words, a K that M thinks is a set, and moreover is a model of T+\varphi^\mathbb{N}.

Resolving this steers us straight into one of the key issues in logic: the sort-of definability of truth.

As Tarski taught us, truth is undefinable—the notion, “m is the Gödel number of a statement in L(ZF) that is true”, can’t be expressed in L(ZF). But Tarski also gave us a definition of truth, expressed using set-theoretical concepts. Examine his definition carefully, and you walk away with two useful approximations to truth:

  • For any natural number d, there’s a formula in L(ZF), Truthd(m), expressing the notion “m is the Gödel number of a statement of L(ZF) with at most d quantifiers that is true.”
  • For any set K and any binary relation E on M, there’s a formula in L(ZF), Truth(K, Em), expressing the notion “m is the Gödel number of a statement of L(ZF) that is true for the structure (K, E).”

Before Tarski and Gödel we had Skolem, who gave us the downward Löwenheim-Skolem theorem. (Well, the exact history is more complicated than I want to get into.) When you apply this to set theory (and skirt some pitfalls), you get what’s called the Reflection Principle. This tells us that there’s a model (K,E) of any finite number of true statements in L(ZF). Also K is a set. (Actually, it tells us more than that, but that’s all we need here.) Important: the Reflection Principle is a theorem of ZF.

Putting it all together: we argued, informally, that if φ holds in \mathbb{N}^M, then \text{Con}(T+\varphi^\mathbb{N}) holds in \mathbb{N}. If we could formalize this argument in ZF, then we could replace that last “\mathbb{N}” with “\mathbb{N}^M“. Well, we don’t have “the whole truth” in L(ZF), but we have “enough truth” to push this through for any finite fragment Tn of T. The upshot: \text{PA}^T\Rightarrow\text{PA}+\Phi_T.

We still have to do the reverse implication, but maybe it’s time to catch our breath.

JB: [Panting.] Yeah… let me think about this a while as I catch my breath.

First of all, what’s this about sneaking around Tarski’s theorem on the undefinability of truth by restricting to sentences with at most d quantifiers? How come nobody ever told me about this? It seems like such cheap dodge. Sort of like: “I can’t tell you which sentences for true, but any d, I can tell you whether sentences of length at most d are true.” I know it’s not the same thing, but my instinctive reaction is “Wow, you’re a fussy sort of fellow, but that’s good enough for me.” Sort of like “I can’t give you all the money you might ask for, but for any d I can give you up to d dollars.”

Of course I can guess it must have to do with how there’s no way to string all the definitions of truth into a single one. If what you’re saying is true I guess there can’t be an algorithm that, given d, prints out the definition of truth for sentences with at most d quantifiers. So they must get more and more complicated in a nasty way.

Can you say a bit about how this works? I feel it’s a big hole in my education!

MW: It’s not nearly that bad! Or even surprising when you think about it. And there certainly is an algorithm to generate the series of formulas Truthd.

Let’s back up and look at Tarski’s definition of truth, aka satisfaction (⊨). It’s pretty much the obvious induction:

  • M\models\neg\alpha if and only if not M\models\alpha
  • M\models\alpha\wedge\beta if and only if M\models\alpha and M\models\beta
  • M\models\exists x\alpha(x) if and only if for some c\in M, M\models\alpha(c)

You can write the analogous clauses for ∨ and ∀ easily enough, or you can treat these as defined symbols. (I’m being brief, but any model theory book will have the details. Or my Basics of First-order Logic.)

We’re skirting the edges of philosophy here, or at least post 5. Don’t you need to know what ‘not’, ‘and’, and ‘for some’ mean, to make sense of the right hand sides?

Well yes. Initially view this definition through the rose-colored lenses of informal set theory, the basic mathematical intuition we absorbed in infancy with our mother’s milk. But when it comes time to formalize the definition, we have a few options.

We can treat this as an inductive definition of Truthd+1 in terms of Truthd. So it goes a little like this:

\text{Truth}_{d+1}(u) is the formula (u=\ulcorner\neg\alpha\urcorner\rightarrow\neg\text{Truth}_d(\ulcorner\alpha\urcorner))\wedge \ldots\wedge(u=\ulcorner\exists x\alpha(x)\urcorner\rightarrow\exists c \text{Truth}_d(\ulcorner\alpha(c)\urcorner))

The corner brackets ⌜and⌝ as usual stand for the Gödel number of the formula, but they’re also a way to nod at some messy recursive functions—functions that can be coded into the language of PA. PA officially talks only about numbers, but via Gödel numbering it can also discuss strings of symbols. So the end of the Truthd+1 formula is saying, “If you have an existentially quantified formula ∃xα(x), then it’s true if and only if there is a number c such that α(c) is Trued; here α(c) is the result of substituting the numeral for c into the formula α(x).” The substitution function, fairly transparent at the formula level, becomes a tedious mess if you can refer only to the Gödel numbers of formulas. Kind of like those coded conversations in spy movies, where “The water buffaloes have grazed their fill and are herding towards the house with the sparrows” means something about troop movements. Still, PA is fully up to the task.

Textbooks have to spend serious time on this, but you should just take away two observations: (a) We obtain the formula for Truthd+1 by cobbling together (modified) copies of Truthd, plus connective tissue; (b) The occurrence of ∃c represents a genuine quantifier in the language of PA. In other words, the quantifiers of a closed formula α “bleed through” to the formula Truthd(⌜α⌝). Technically, Truthd is a Σd formula. So as d increases, the formulas Truthd become more and more complicated, but in a very uniform manner.

I guess we’re not actually working over PA, but over ZF. That makes it even easier. Each formula gets a Gödel set to represent it. Actually. the slickest approach is to code the formulas as parse trees. Representing finite trees as sets is something ZF can do in its sleep.

In your money analogy, we don’t have a single philanthropist, but an infinite series of philanthropists, who can give more and more moola.

As I said, this isn’t the only way to formalize truth. But as this theme will recur again and again, let’s leave it there for now.

JB: Hmm, I’m still confused about why you can’t parlay all these very systematically defined Truthd predicates into a single truth predicate, something that amounts to ∃dTruthd.

Of course I know from Tarski’s theorem on the undefinability of truth that this can’t work! But I still want to know why. It’s sort of like perpetual motion machines. If someone hands you a design for one, you can either invoke general laws to assure yourself it won’t work, or you can examine the design carefully to see exactly where it goes wrong. It all depends on how patient you’re feeling.

But okay, let’s leave it here for now: this is a bone I can chew on in my own spare time.

MW: What you’re suggesting would work if Truthd(u) was a single formula with two parameters, one being d. In other words, Truth(d,u). Then you could quantify over d. However, what we have is a sequence of ever longer and more complicated formulas indexed by d.

Prev TOC Next

Leave a comment

Filed under Conversations, Peano arithmetic

First-Order Categorical Logic 6

Prev TOC

MW: An addendum to the last post. I do have an employment opportunity for one of those pathological scaffolds: the one where B(0) is the 2-element boolean algebra, and all the B(n)’s with n>0 are trivial. It’s perfect for the semantics of a structure with an empty domain.

The empty structure has a vexed history in model theory. Traditionally, authors excluded it from the get-go, but more recently some have rescued it from the outer darkness. (Two data points: Hodges’ A Shorter Model Theory allows it, but Marker’s Model Theory: An Introduction forbids it.)

All n-ary predicates with n>0 are universally true, because they’re vacuously true. But ∃x(x=x) is false, while ∀x(x=x) is true. So B(0) contains both verum ⊤ and falsum ⊥.

The empty structure looks a little sketchy in this sense: the standard rules of logic require modification for it. (That’s why you were able to prove that its logic isn’t a hyperdoctrine.) On the other hand, its good friends the number 0 and the empty set will vouch for its good character. Hodges allows it because it makes some theorems nicer.

JB: This is very interesting. Near the end of our last conversation I claimed without proof that the Frobenius condition implies

x(⊤) = ⊤.

I now think this was wrong. Without this, I don’t think we can rule out the hyperdoctrine where B(0) is the 2-element boolean algebra but all the B(n)’s with n>0 are trivial. And that makes me happy, because I really don’t want to rule out the empty model of first-order theories—that is, what you’re calling the ’empty structure’.

So let me revise some of my statements from the last episode. I want to define an inconsistent hyperdoctine B to be one with ⊤ = ⊥ in B(0), or equivalently, one where B(0) is the trivial boolean algebra. This implies that all the B(n)’s with n>0 are trivial.

If any B(n)’s with n>0 is trivial, then they all are—our techniques from last time do show this—but this does not imply that B(0) is trivial. There is a hyperdoctrine where B(0) is the 2-element boolean algebra but all the B(n)’s with n>0 are trivial. And this one is consistent.

To really prove this we should dig deeper into the Frobenius condition—and for that matter, the Beck-Chevalley condition. If I try to explain these to you, that will force me to understand me better.

But I think we’re okay for now. So, this time it’s for real: on to Gödel’s completeness theorem!

Prev TOC

Leave a comment

Filed under Categories, Conversations, Logic

First-Order Categorical Logic 5

Prev TOC Next

JB: Okay, let me try to sketch out a more categorical approach to Gödel’s completeness theorem for first-order theories. First, I’ll take it for granted that we can express this result as the model existence theorem: a theory in first-order logic has a model if it is consistent. From this we can easily get the usual formulation: if a sentence holds in all models of a theory, it is provable in that theory.

Continue reading

Leave a comment

Filed under Categories, Conversations, Logic

Epstein Relativity Diagrams

[This post is available in pdf format, sized for small and medium screens.]

Lewis Carroll Epstein wrote a book Relativity Visualized. It’s been called “the gold nugget of relativity books”. I wouldn’t go quite that far, but Epstein has devised a completely new way to explain relativity. The key concept: the Epstein diagram. (I should mention that Relativity Visualized is a pop-sci treatment.)

Continue reading

Leave a comment

Filed under Physics, Reviews

Non-standard Models of Arithmetic 12

Prev TOC Next

JB: It’s been a long time since Part 11, so let me remind myself what we’re talking about in Enayat’s paper Standard models of arithmetic.

We’ve got a theory T that’s a recursively axiomatizable extension of ZF. We can define the ‘standard model’ of PA in any model of T, and we call this a ‘T-standard model’ of PA. Then, we let PAT to be all the closed formulas in the language of Peano arithmetic that hold in all T-standard models.

This is what Enayat wants to study: the stuff about arithmetic that’s true in all T-standard models of the natural numbers. So what does he do first?

Continue reading

Leave a comment

Filed under Conversations, Peano arithmetic

Algorithmic Information Theory

Back in the 60s, Kolmogorov and Chaitin independently found a way to connect information theory with computability theory. (They built on earlier work by Solomonoff.) Makes sense: flip a fair coin an infinite number of times, and compare the results with the output of a program. If you don’t get a 50% match, that’s pretty suspicious!

Three aspects of the theory strike me particularly. First, you can define an entropy function for finite bit strings, H(x), which shares many of the formal properties of the entropy functions of physics and communication theory. For example, there is a probability distribution P such that H(x)=−log P(x)+O(1). Next, you can give a precise definition for the concept “random infinite bit string”. In fact, you can give rather different looking definitions which turn out be equivalent; the equivalence seems “deep”. Finally, we have an analog of the halting problem: loosely speaking, what is the probability that a randomly chosen Turing machine halts? The binary expansion of this probability (denoted Ω by Chaitin) is random.

I wrote up my own notes on the theory, mostly to explain it to myself, but perhaps others might enjoy them.

Leave a comment

Filed under Logic