I would like to open a new thread to discuss a slew of really basic, dumb questions I have (or will be having) about type theory in general. In asking these questions, I am exposing the fact that I am literally decades out of the loop when it comes to a detailed technical understanding of the ins and outs of type theory as it relates (eventually) to $(\infty, 1)$-categories, and I can only beg your indulgence in answering them or pointing me to references that answer them (to my satisfaction!).
These questions may on occasion seem utterly nit-picky. Please do not be offended by that. I find much of the language I see in articles (in the nLab or elsewhere) pretty confusing – I realize that here as elsewhere, much of the language is abbreviatory (is that a word?) and developed for reasons of human convenience, i.e., there may be really good reasons for it. But I will probably continue to be confused unless I ask the nit-picky questions in order to learn how to use the language correctly.
And maybe, just maybe, my noobie questions will be useful for other noobies, now or in the future. Maybe they will lead to improvements of the articles! Here are a few to get started.
I’m looking at Martin-Löf dependent type theory. First: does this article give a complete specification of the theory?
So first things first: I understand the specification of a type theory to involve a bunch of things, but the basic framework is that it tells the user how to form “valid judgments”, each judgment expressing either that a type $A$ is valid in a context $\Gamma$ (here the judgment is by convention written $\Gamma \vdash A: Type$), or that a typing declaration $t: A$ of a term is valid in context $\Gamma$ (where the judgment is written as $\Gamma \vdash t: A$), or that an equality between terms (of given declared types in contexts) holds, written $\Gamma \vdash (s = t): A$. I believe that usage of equality is what is often called “judgmental equality”?
Under “Syntax”, it says we assume given a stock of variables, atomic types, and parametrized types. A possible nit-pick: I would prefer to distinguish some atomic types as “variable types” and some as “constant types” (such as $\top, \bot$). It’s the “variable types” (or whatever people call them) that come as part of the pre-existing stock of linguistic items, whereas the constant types will get introduced with some of the type introduction/formation rules. Right? [Of course the more encompassing terminology “atomic” is convenient when specifying (valid) contexts and judgments of types in contexts.]
Re “parametrized types”: no link is provided there to explain what this means, and it feels like there should be one. But I guess what is meant is an expression like $B[x]$ where $B$ is a type and $x$ is a variable expression. Do we have a stock of parametrized types $B[x]$ for any type $B$ introduced? Can’t the process be iterated, e.g., $B[x_1][x_2]$? Is there a list of rules for valid formation of parametrized types? (For example, if we write $B[x]$, then I suppose $x$ had better not already occur freely in $B$, that sort of thing.)
I would prefer to say: “a context is a finite ordered list of variable typing declarations, defined inductively as follows…”. But to be even more nit-picky, I think I would prefer: “A context is a finite ordered list of variable typing declarations. A valid context is defined inductively as follows…” (I mean, at first it just says “context”, and then later “valid context”, so I suppose the latter is what is being defined inductively.)
But I think the question I want to know most about is about something I didn’t see mentioned anywhere on the page; sorry if I missed it. In saying which parametrized types are valid in a given context, would one of the rules be as follows? “If $\Gamma \vdash B: Type$ is valid and if $\Gamma, x: A$ is a valid context, then $\Gamma, x: A \vdash B[x]: Type$ is valid.”
One last question for now: in writing “$true$”, within the context of just Martin-Löf dependent type theory, nothing fancier (like full blown HoTT), is $true$ supposed to be a formal type, or is the use of this string of letters just a convention, similar to the convention of using “$Type$” in a judgment like $\Gamma \vdash A: Type$?
Under “Syntax”, where it says we assume given a stock of variables, atomic types, and parametrized types.
I think this assumption may be misleading. I would treat the various kinds of judgement that you mention as purely syntactic. So we are allowed to write $MyFavouriteString : Type$, where $MyFavouriteString$ can be replaced by any string you like, such as $A$ or $\mathsf{Group}$ or whatever. No mathematical significance whatsoever, we are just saying that we are allowed to write such strings. We take certain such strings as our starting point (which we may write as ’rules’
$\frac{}{MyFavouriteString : Type}$ with empty numerator). We say that those strings are (primitive) ’valid judgments’, or ’axioms’. And now we just have a collection of rules for building up syntax from valid judgements.
If for instance we have a valid judgement of the form $MyFavouriteString : Type$, we are then permitted to write $MySecondFavouriteString : MyFavouriteString$. We may write this as
$MyFavouriteString : Type \vdash MySecondFavouriteString : MyFavouriteString.$ At this point, I would say that the latter is not a valid judgement, it is just a legitimate piece of syntax. We might take certain such pieces of syntax as part of our primitive valid judgements as well, i.e. as axioms. We might then instead write the above as $\frac{MyFavouriteString : Type}{MySecondFavouriteString : MyFavouriteString}$. But there is no a priori significance to this, it is just a syntactical shorthand.
If we know that $MySecondFavouriteString : MyFavouriteString$ is valid, we are permitted to write
$MyThirdFavouriteString(MySecondFavouriteString) : Type.$ Again, just a piece of syntax at this point. To fully justify/express concisely why this piece of syntax is permitted, we may write
$MyFavouriteString : Type, MySecondFavouriteString : MyFavouriteString \vdash MyThirdFavouriteString(MySecondFavouriteString) : Type.$ This is going to get a bit tedious, so we often write $\Gamma$ for part or all of the left of the $\vdash$.
I guess you will see what I am getting at by now. There is in a way no such thing as variables, atomic types, and parametrised types, certainly not a priori: we just have certain permissible syntactical constructions; certain of these we deem to be valid judgements; we have rules for permissible pieces of syntax given valid judgements; and we have rules for which permissible pieces of syntax are valid judgements. I.e. it’s all just a big syntactical induction.
In programming languages with mutually inductive-recursive types (I may have the terminology wrong here), one can even express this formally, i.e. one can cook up a ’type theory’ as one gargantuan mutually inductive-recursive type.
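For instance, in Haskell (which lacks genuine induction-recursion, so take this only as a crude sketch, with all names my own invention) one might begin like this: judgements and derivations as plain inductive types, where each constructor is a rule and a value of the derivation type is a derivation tree built from axioms.

```haskell
-- A toy sketch: judgements and derivations as inductive types.
-- This only approximates the "one gargantuan inductive-recursive
-- type" idea; real induction-recursion needs something like Agda.
data Judgement
  = IsType String          -- the syntax  A : Type
  | HasType String String  -- the syntax  t : A
  deriving (Eq, Show)

-- Each constructor is a rule; a value of type Deriv is a
-- derivation tree built from axioms by applying rules.
data Deriv
  = AxType String        -- axiom with empty numerator:  A : Type
  | AxTerm String Deriv  -- given a derivation of  A : Type,
                         -- posit an axiomatic term  t : A
  deriving Show

-- Which judgement, if any, a putative derivation establishes.
conclusion :: Deriv -> Maybe Judgement
conclusion (AxType a)   = Just (IsType a)
conclusion (AxTerm t d) = case conclusion d of
  Just (IsType a) -> Just (HasType t a)
  _               -> Nothing
```

So `conclusion (AxTerm "MySecondFavouriteString" (AxType "MyFavouriteString"))` recovers exactly the valid judgement discussed above, while an ill-formed derivation yields `Nothing`.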
Martin-Löf has a philosophical justification for why his rules for forming valid judgements are valid, but one can justify those rules in pragmatic ways as well.
Now, do take with a pinch of salt that this is my own way of understanding this, and it may not be completely canonical. But I certainly think it is a valid way to understand it.
There is in a way no such thing as variables
I’ll need more time to read your comment (thanks!), but perhaps to be compared with Tarski’s way of doing logic without variables? E.g., his book with Steven Givant, A Formalization of Set Theory Without Variables.
I also think of string diagrams (as in the stuff I did on Peirce’s Beta with Gerry Brady) as a possible way of doing logic without variables.
Those two references sound very interesting! I am not familiar with them, but I would guess that they are slightly different: I would say that one basically does have things corresponding to variables in type theory (Martin-Löf or otherwise). For example if one writes $n : \mathbb{N} \vdash n^{2} + 2 : \mathbb{N}$, then $n$ is basically a ’variable’. I was suggesting more that we do not need to say what a variable means in order to define a type theory, it is rather an a posteriori notion; this I guess is the standard point of view in first order logic/set theory as well.
From the purely syntactical point of view of #2, Martin-Löf’s principal innovation was his introduction of syntax of the form $A : Type \vdash A(x) : Type$ in rules for valid judgements (i.e. of ’dependent types’), his justification of those rules, and his illustration of how they are useful.
While I’m here, I can have a go at your other questions.
I believe that usage of equality is what is often called “judgmental equality”
Yes. I.e. a purely syntactical thing: the symbol $=$ is one of a handful of strings which are picked out to be used in permissible syntax, along with $:$ and $Type$.
Can’t the process be iterated, e.g., $B[x_1][x_2]$?
Not quite as such. As below, $B(x)$ is a single syntactical entity, so it doesn’t make sense to write $B(x_1)(x_2) : Type$ per se. But of course one can have types depending on dependent types, e.g. $x : A, B(x) : Type, y : B(x) \vdash C(y) : Type$ is a permissible piece of syntax, and a valid judgement, in a context in which $x : A$ is a valid judgement. One might abbreviate that to $B(x)(y) : Type$ or $B(x,y) : Type$, but one has to be careful to understand what it expands to.
Is there a list of rules for valid formation of parametrized types? (For example, if we write $B[x]$, then I suppose $x$ had better not already occur freely in $B$, that sort of thing.)
I think one can do everything according to the simple rules for permissible syntax of #2, and the rules one chooses to say which of those pieces of syntax are valid judgements. It will be mightily confusing in practice if one repeats symbols, but if one really writes out the full context, one can repeat without ambiguity (one may end up repeating an entire piece of syntax, but, whilst superfluous, this is harmless).
In saying which parametrized types are valid in a given context, would one of the rules be as follows? “If $\Gamma \vdash B: Type$ is valid and if $\Gamma, x: A$ is a valid context, then $\Gamma, x: A \vdash B[x]: Type$ is valid.”
I think this is not quite correct. The $B(x)$ is one entire syntactic entity, it does not presuppose a $B$.
One last question for now: in writing “$true$”, within the context of just Martin-Löf dependent type theory, nothing fancier (like full blown HoTT), is $true$ supposed to be a formal type, or is the use of this string of letters just a convention, similar to the convention of using “$Type$” in a judgment like $\Gamma \vdash A: Type$?
The way it is used at Martin-Löf dependent type theory, I would say it is just a convention, i.e. just another piece of permissible syntax, along with rules for saying which pieces of permissible syntax involving it are valid judgements.
Thanks for having a go at my questions, Richard.
Let me say that I am just trying to understand the rules as they are written on that nLab page. (I should maybe save people some time by saying that I do have some background in formal logic, for example sequent calculus and categorical semantics of various substructural fragments of full-blown sequent calculus for first-order logic, so that it might not be totally necessary to explain what an axiom is, for instance.)
I’m willing to accept things like
I think this is not quite correct. The $B(x)$ is one entire syntactic entity, it does not presuppose a $B$
as reflecting a misunderstanding on my part – I was making a guess there, after all. But then could someone explain which rules described on that page would permit a valid judgment of the form
$\Gamma \vdash B(x): Type$? Because according to what I read, I was not (for example) interpreting $B(x)$ as an “atomic type” so that one could derive that judgment according to what was said two lines above Binary product types. And yet when you (Richard) say “the $B(x)$ is one entire syntactic entity, it does not presuppose a $B$”, that sounds almost as if you do treat that bit of syntax as “atomic”.
I should maybe save people some time by saying that I do have some background in formal logic
I believe you have many more times background than me :-). I thought I’d try to explain, with an unknown reader in mind, how to view even very basic things as purely formal, inductively constructed syntactic entities; sometimes those basic things are amongst the hardest to understand. For me, it was actually quite a radical idea when I first understood it some years ago (through pondering some writings of Martin-Löf) to view things like this; it certainly goes beyond what is done in ordinary logic/set theory. Apologies for anything that was stating the obvious!
that sounds almost as if you do treat that bit of syntax as “atomic”.
I think that maybe the word ’atomic’ has potential to be confusing, but yes, definitely I see $\Gamma \vdash B(x) : Type$ as saying that we can form the entire piece of syntax $B(x) : Type$ given a list of permissible pieces of syntax $\Gamma$ ending with $x : A$ for some $A : Type$. And I definitely also see there as being nothing more to it than that. I don’t see the two lines above ’Binary product types’ as allowing one to derive anything; actually I think those two lines are more or less tautologous (and I would drop the word ’atomic’ from ’atomic type’).
I guess one would need $x:SomeType$ somewhere in $\Gamma$.
One thing that might be tangentially helpful here is that in one of his earlier works, predating I think his work on type theory, Martin-Löf makes a few remarks about what it means to construct some mathematical object (I think the work may be Notes on constructive mathematics). He breaks this down into at first literally constructing 5 as five marks $| | | | |$, or perhaps four such marks with a line through, on a piece of paper. In other words, constructing means literally writing it down. I see the highly syntactical point of view of #2 as coming out of that. Note that the standard ’meaning’ of ’for all’ in ordinary logic/set theory of course already violates this.
Re #7, #8: okay, good, but I would still like to see some clarity on this point somewhere on the page (again, maybe I missed it by reading too quickly). Plus, a few brief words on the syntax of ’parametrized types’ would be good to have there.
I still have some lingering issue with a ’$B$’ not being presupposed when one writes $B(x)$. I’m willing to go along if that’s what the experts say, but then I’m not quite sure what to make of things I read in Seely, where under 1.1.1 Type formation rules (page 2 of 16), I see “(iv) If $A, B[x]$ are types, $x \in A$, where $x, B$ satisfy the c.o.v. [condition on variables]…” – what’s that $B$ then? (The thing that Richard told me was not quite correct in #5 was my guess partly based on what I was reading in Seely.)
actually I think those two lines are more or less tautologous
Perhaps on an intuitive level, yes, but at the moment I’m taking a hard-nosed formalist approach where I want to know how to apply the rules correctly, where I definitely don’t see any tautology there! You can’t skip over such ’minutiae’ and have a correct specification! :-)
Someone else may like to take up the discussion from a point of view closer to that currently on the page. I just wished to remark that I view #2 as completely rigorous and formal, indeed more so than what is currently on the page. (Obviously, I do not give all the rules in #2, but I feel that this is beside the point for fundamentally understanding).
Perhaps on an intuitive level, yes
On a fully formal level, those two lines do not make any non-tautologous sense at all in my opinion, in particular they are not rules. To say that $A$ is a type means exactly that the syntax $A : Type$ is a valid judgement (possibly in context), nothing more and nothing less. Valid judgement is defined inductively in the manner already discussed.
they are not rules
They look like axioms to me. As I thought you also were saying in #2.
Interesting! I view them as quite different. The following
For any context $\Gamma$, if $A$ is an atomic type, $\Gamma \vdash A : \mathrm{Type}$
requires, as I see it, a pre-existing notion of ’atomic type’, and is also ambiguous about the role of $\Gamma$.
In #2, there is no pre-existing notion of atomic type: the syntax $MyString : Type$ is always legitimate, for any string $MyString$ you like, and for certain strings (in certain contexts) it is a valid judgement. The only way to exhibit that it is a valid judgement is to start from some axiomatic valid judgements and apply rules to determine which pieces of syntax constructable from these are valid judgements.
In particular, whilst the syntax $A : Type$ is always legitimate, it is not an axiom that this syntax is a valid judgement: only a certain collection of such pieces of syntax will be taken axiomatically to be valid judgements.
I have a feeling that there is less fundamental disagreement than may appear in this discussion. I don’t quite understand what distinction is trying to be drawn by opposing “legitimate” to “valid” though, or what would then count as illegitimate. But as I say, I want to focus on the fully formal aspects, where the fundamental objects are judgments in context, and I want to be quite sure that the page accurately captures the calculus.
I tried to say earlier what I thought “atomic type” meant, but of course I would welcome some discussion on that.
As far as the role of $\Gamma$ goes: the only role I see is that judgments of this sort take the form $\Gamma \vdash A: Type$, i.e., we always specify the context $\Gamma$ in forming judgments, so it has to be there. (It could be empty of course.) “Tautology” doesn’t seem like a bad word for what we’re discussing (a universal truth in formal logic), but my own background in sequent calculus is that “axiom” is another term for it, e.g., one refers to an “identity axiom” $A \vdash A$ (usually not an “identity tautology” in the argot I’m familiar with). Again, it doesn’t sound to me that the disagreements on this particular matter are fundamental.
I don’t quite understand what distinction is trying to be drawn by opposing “legitimate” to “valid” though, or what would then count as illegitimate.
I think the distinction Richard might be trying to make - Richard, do please correct me if I’m wrong! - is the distinction in the XML world between something being “well-formed” and something being “valid”. An XML document can be ’well-formed’, but not ’valid’: that is, it can obey the syntactic rules applicable to all XML documents, yet still contain markup or content that is incorrect according to the more specific set of rules applicable to that specific sort of XML document. So if there was an XML document containing the text:
<inequality>1 < 2</inequality>
Then regardless of anything else, that text is not ’well-formed’, because it contains a literal “<” that isn’t being used to indicate the start of a markup tag. But if there was an XML document containing the text:
<greeting>Hello world</greeting>
then that’s ’well-formed’, but only ’valid’ if “greeting” is a valid markup tag for that sort of XML document.
So when Richard writes:
whilst the syntax $A : Type$ is always legitimate, it is not an axiom that this syntax is a valid judgement: only a certain collection of such pieces of syntax will be taken axiomatically to be valid judgements.
I believe he’s saying that $A : Type$ is always ’well-formed’ (syntactically speaking); but that in any given system, only a particular collection of well-formed statements will also be considered axiomatically ’valid’ as well.
If you really want you could model all these rules on strings of characters but I don’t think this would add anything. Then “atomic types” and judgements are just valid syntax in some language etc..
With due respect to Richard, I think that a lot of what he is saying is either misleading or not addressing the real questions. Let me have a go; but first let me say that these are not dumb questions (and not even in the hackneyed sense of “there are no dumb questions”). Some of them are actual problems with the article you’re looking at (which we should fix as part of the conclusion of this discussion); others are serious issues that many people (sometimes even experts) get confused about.
First: does this article give a complete specification of the theory?
I believe this article is intended to give a complete specification of a theory. I didn’t write this article myself, and I have not read it carefully enough to know whether it’s correct or whether something is missing.
I believe that usage of equality is what is often called “judgmental equality”?
Yes. Although I think one usually doesn’t put parentheses around the equality, writing just $\Gamma \vdash s=t:A$; writing $(s=t):A$ looks like you are judging “$s=t$” to be a term of type $A$. (Which is in fact something that one might do for a different meaning of equality: if one is using $=$ for the propositional or typal equality, then $s=t$ is a type, which might be a term belonging to a universe.)
For a similar reason, it’s more usual not to write a colon in the “is a type” judgment, instead saying $\Gamma \vdash A \, type$; writing $\Gamma \vdash A:Type$ is usually reserved for the case when you have a universe of types called “$Type$” and $A$ is a term belonging to it (any such term is, or can be coerced to, a type, but for a given type universe not every type will belong to it).
I would prefer to distinguish some atomic types as “variable types” and some as “constant types” (such as $\top, \bot$).
I would not regard $\top,\bot$ as “atomic”, just as in propositional logic I would not regard them as part of the “theory”. In the latter case they are nullary connectives; in type theory they are nullary type forming operations. I would not use the word “variable” for atomic types either, to avoid confusion with the “variables” that appear in a context and also the “meta-variables” (see below) that are used when specifying the theory. Sometimes people say “axiomatic” instead of “atomic”.
Re “parametrized types”
I think what is meant here (at least, what should be meant) is that in a dependent type theory, some of the “atomic types” can be dependent on other atomic types, or on other types built from them. That is, in addition to an atomic type $\mathtt{A}$, which gives rise to a rule
$\frac{\quad}{\Gamma \vdash \mathtt{A}\, type}$ we could have an atomic type $\mathtt{B}$ dependent on $\mathtt{A}$, which would give rise to a rule like
$\frac{\quad}{\Gamma, x: \mathtt{A} \vdash \mathtt{B}(x) \, type}$ (Although a type theorist would usually write this latter rule as
$\frac{\Gamma \vdash a:\mathtt{A}}{\Gamma \vdash \mathtt{B}(a) \, type}$ to make substitution admissible (see below).) This is actually quite tricky to make precise: instead of taking the atomic types (or, at least, their typing rules) to be given at the start, you have to specify them inductively along with the rest of the theory. E.g. the atomic types are a well-ordered list in which each element is a symbol $\mathtt{C}$ along with a judgment
$x_1:A_1, \dots, x_n:A_n \vdash \mathtt{C}(x_1,\dots,x_n) \, type$ in which $x_1:A_1, \dots, x_n:A_n$ is a valid context according to the type theory generated by the atomic types appearing earlier than $\mathtt{C}$.
In addition, for completeness there should also be atomic terms as well as atomic types.
I would prefer to say: “a context is a finite ordered list of variable typing declarations,
Given the way this article is using the term “typing declaration”, you are absolutely right: the subjects of the typing declarations in the context must be variables. I don’t know that that usage is standard however.
“A context is a finite ordered list of variable typing declarations. A valid context is defined inductively as follows
You’re right here too. Indeed, the way the bullet points are phrased in the article suggests that its author was implicitly thinking of that too, since the phrase “the context obtained by … is valid” only makes sense if there is a notion of “not-necessarily-valid context”. When being careful about this, people sometimes include a separate judgment for “being a valid context”, say “$\Gamma\,ok$”, and then the definition of context-validity here becomes formal rules like
$\frac{\quad}{\emptyset \,ok}$ and
$\frac{\Gamma \,ok \qquad \Gamma\vdash A\,type \qquad x\notin \Gamma}{(\Gamma,x:A) \,ok}$ This approach automatically incorporates the idea that context validity is defined mutually-inductively with type-validity (and term-validity, since that gets all mixed up with type-validity as well in a dependently typed theory).
However, another approach is to regard the type- and term-validity judgments as ranging over judgments with possibly-invalid contexts. In that case, the notion of valid context does not have to be defined mutually-inductively; instead it can be defined simply on its own after the (mutual) type and term judgments have been stated. And (assuming the rules are stated correctly) we can prove, after the fact, that in any valid judgment $\Gamma \vdash A\,type$ or $\Gamma \vdash t:A$, it happens that $\Gamma$ is a valid context in this sense.
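To make this second, after-the-fact notion of context validity concrete, here is a toy Haskell checker (my own simplified model, not the page's rules): a "type" is represented crudely as just the list of variables it mentions, and validity is checked by a single left-to-right pass.

```haskell
-- A toy context is an ordered list of  x : A  declarations, where a
-- "type" is crudely modeled as the list of variables it mentions.
type Ctx = [(String, [String])]

-- Walk the context left to right: each declared variable must be
-- fresh, and each declared type may mention only variables that were
-- declared earlier in the context.
ctxOk :: Ctx -> Bool
ctxOk = go []
  where
    go _    []               = True
    go seen ((x, ty) : rest) =
      (x `notElem` seen) && all (`elem` seen) ty && go (x : seen) rest
```

So, for example, `[("x", []), ("y", ["x"])]` (i.e. $x : A,\; y : B(x)$) checks out, while `[("y", ["x"])]` fails because $x$ was never declared, and a repeated variable fails the freshness condition.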
would one of the rules be as follows? “If $\Gamma \vdash B: Type$ is valid and if $\Gamma, x: A$ is a valid context, then $\Gamma, x: A \vdash B[x]: Type$ is valid.”
There are really two issues here. The first is one of notation: your conclusion would generally be written as simply $\Gamma, x: A \vdash B: Type$.
To explain this further (if necessary), note that $\Gamma$, $A$, and $B$ here are technically “meta-variables”: we are using them to stand for an arbitrary context and two arbitrary types. This is just the usual mathematical use of variables to stand for things, except that we have to be careful to distinguish it from the internal “variables” that occur syntactically in the terms of type theory. In fact, here $x$ is a meta-variable that stands for an arbitrary syntactic variable! The types that $A,B$ stand for are syntactic strings like $(\Sigma y:\mathtt{C})\mathtt{D}(y,z)$ (here I’m using the article’s notation for $\Sigma$-types, although it is pretty non-standard), built up from the type-forming operations out of the atomic types (here $\mathtt{C}$ and $\mathtt{D}$ are intended to be atomic types). If such a type is to be valid in some context, then in particular all of its free variables must be declared in that context.
Now, validity of $\Gamma \vdash B\, type$ implies that the variable $x$ doesn’t occur syntactically in (the syntactic string) $B$. Therefore, it makes sense to consider $B$ itself in context $\Gamma,x:A$ as well, and the rule you quote (most commonly called “weakening”) should certainly be true: adding new irrelevant declarations to the context doesn’t invalidate a type (or a term, for that matter).
However (and this is the second issue), it’s not usual to give weakening as a primitive rule, meaning one of the explicit inductive rules that we use to define type validity. Instead, we formulate all the other primitive rules in such a way that after the fact we can prove that whenever $\Gamma \vdash B\, type$ is valid and $\Gamma, x: A$ is a valid context then $\Gamma, x: A \vdash B\, type$ is also valid. Such an after-the-fact theorem is called an “admissible rule”. The reasons for making weakening admissible rather than primitive are rather subtle and complicated (and not completely understood even by me), but if you want I can try to explain my understanding.
in writing “$true$”, within the context of just Martin-Löf dependent type theory, nothing fancier (like full blown HoTT), is $true$ supposed to be a formal type, or is the use of this string of letters just a convention
As with the “$A \, type$” judgment, this judgment is usually written without a colon as “$A\, true$”, with “$true$” just part of the judgment syntax (like the colon and the turnstile) rather than a type inside the theory.
By the way, this article’s treatment of equality is also kind of weird. It seems to be giving the usual rules for judgmental equality, but instead regarding them as rules constructing unspecified terms belonging to an equality type, and seems to omit any rule explaining how to transport or substitute along such equalities. I haven’t ever seen this done and am somewhat skeptical that it works; is it taken from one of the references?
I also disagree with the article’s claim that “it is essential, for a purpose such as this, that the type-formation clauses be regarded as an inductive definition…”. The type-formation clauses are an inductive definition (there is no question of how to “regard” them), but they correspond semantically to operations on a model of type theory which might not itself be inductively defined, and the corresponding operations for equality (even formulated as in the article, as far as I can see) make perfect sense.
Also: what should be done in a formal presentation, and might clarify some of these questions (and certainly address Richard’s points about allowable strings), is to start out by giving a formal definition of the raw syntax, along the following lines.
We suppose given an infinite set $Var$ whose elements we call “variables”, and a set $Atom$ whose elements we call “atomic terms” (including atomic types). Then the set of (raw) terms (including raw types) is defined inductively as follows:
There are two ways to regard such a definition. On one hand, we can regard it as defining the set of terms as a subset of the set of finite lists (“strings”) of symbols drawn from a given alphabet $Var \cup Atom \cup \{ \times, \langle,\rangle,(,),\pi_0,\pi_1,\Pi,:,\lambda,\dots \}$. Note that parentheses, colons, commas, and so on are all symbols in the alphabet. In addition, this point of view requires us to alter some of the clauses to include extra parentheses, e.g. we have to write $(A\times B)$ so that $((A\times B)\times C)$ has an unambiguous meaning. This sort of approach seems to be common for introducing the syntax of (say) first-order logic in mathematical logic textbooks, at least it was when I was an undergrad.
However, a more modern perspective is to regard this as defining an inductive type (in the metatheory), with each clause as a constructor. In this case, the terms themselves are not lists of symbols with some property, but rather trees whose nodes are labeled with appropriate symbols. This way we obtain automatically an induction principle for defining functions by recursion and proving theorems by induction over the set of terms, which in the strings-of-symbols approach one has to prove by hand. The strings of symbols we write on the page like $(A\times B)\times C$ are then regarded as just a notation for these “abstract syntax trees”.
It’s true that to be completely precise about the entire passage from “mathematics written on the page” to its meaning, one would want to also define the strings of symbols and prove that they faithfully represent the syntax trees. This is analogous to how a compiler for a programming language must start out by “parsing” and “lexing” the code written by the programmer (which is a string of symbols) into an internal “abstract syntax tree” representation. However, such things are well-understood and extremely uninteresting, so much so that programmers often use “parser generator” programs to write parsers for them; the interesting and nontrivial mathematics starts once we have an abstract syntax tree.
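For concreteness, here is a minimal Haskell sketch of the abstract-syntax-tree view (the constructor names are my own, loosely following the alphabet above, not anything from the page): raw terms as an inductive type, with structural recursion giving us functions like free-variable collection essentially for free, and an "unparsing" function back to fully parenthesized strings.

```haskell
import Data.List (intercalate)
import qualified Data.Set as Set

-- Raw terms as an inductive type: each grammar clause is a
-- constructor, so a term is a labeled tree, not a string.
data Term
  = Var String            -- a variable
  | Atom String [Term]    -- an atomic, possibly parametrized, type/term
  | Prod Term Term        -- A × B
  | Pi String Term Term   -- (Π x : A) B
  deriving (Eq, Show)

-- Structural recursion, for free: collect the free variables.
-- The binder Π removes its bound variable.
freeVars :: Term -> Set.Set String
freeVars (Var x)     = Set.singleton x
freeVars (Atom _ ts) = Set.unions (map freeVars ts)
freeVars (Prod a b)  = freeVars a `Set.union` freeVars b
freeVars (Pi x a b)  = freeVars a `Set.union` Set.delete x (freeVars b)

-- "Unparsing": render the tree back into a fully parenthesized
-- string of symbols, as in the strings-of-symbols presentation.
render :: Term -> String
render (Var x)     = x
render (Atom c []) = c
render (Atom c ts) = c ++ "(" ++ intercalate "," (map render ts) ++ ")"
render (Prod a b)  = "(" ++ render a ++ " × " ++ render b ++ ")"
render (Pi x a b)  = "(Π " ++ x ++ " : " ++ render a ++ ") " ++ render b
```

Note that `render (Prod (Prod (Atom "A" []) (Atom "B" [])) (Atom "C" []))` produces the unambiguous string `"((A × B) × C)"`, and `freeVars` applied to the tree for $(\Pi y:\mathtt{C})\mathtt{D}(y,z)$ correctly reports only $z$ as free, since $y$ is bound.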
First, a quick response to Alexis:
between something being “well-formed” and something being “valid”
Yes, thanks. I had thought that at some point, and then for some reason forgot it.
In general (this is not to Alexis but to everyone), sticking to traditional logical terminology would help me in this discussion. For example, the notion of well-formed formula is very familiar to me. I’d also like to emphasize that at this point in the discussion, I’m really interested more in linguistic usage in type theory than I am about intuitions for how I should think of the referents. Perhaps I’m being overconfident, but to give one example, I feel very comfortable with the categorical semantics of extensional dependent type theory in locally cartesian closed categories, and probably don’t need much help there. (I also think I have an okay grasp of the relation of intensional dependent type theory to $(\infty, 1)$-categories, although I’m less fluent there.)
I suspect that being “well-formed” is roughly akin to being parsable into a raw syntax tree as in #19. So if we work directly with abstract syntax, then we never have to deal with anything “ill-formed”.
And now let me interact with Mike’s responses (I’m very glad to be hearing from you, Mike! – thanks for such a careful and thoughtful response). Please forgive any lawyer-like tone in what I might say; I sometimes feel as if I’m interrogating (well, I guess I am in a way).
Also, this is really my response to Mike’s #17. Mike’s #19 makes me question to some degree what I’ll be writing here and whether I’ve managed to confuse myself further, but let me just go ahead and expose those possible confusions.
I believe this article is intended to give a complete specification of a theory.
I was under the impression that Martin-Löf dependent type theory (which I’ll abbreviate to MLDTT) was well-defined enough so that one could speak of the theory, modulo the issue of treating equality extensionally versus intensionally. It certainly appears that the present article intends to describe the extensional version. Well, I should be more careful since I guess HoTT in all its variants are forms of MLDTT. So I guess what I mean is that I thought there was an acknowledged “core” of MLDTT that the page means to describe, which would include dependent products, dependent sums, and identity types, and maybe nothing more [such as positing inductive types]. Should that view of mine be amended?
Also I thought there was indeed something missing. I’ve scanned the page a few times and never saw any rule that looks anything like
$\frac{\Gamma \vdash a:\mathtt{A}}{\Gamma \vdash \mathtt{B}(a) \, type}$
where a dependent type gets introduced in the conclusion of a derivation. And that to me was missing something important, “tautology” or not. (Again, I’m trying to pay hard-nosed attention to the formalism as written, and not to intuitions along the lines of “tautology” or “it goes without saying that…”)
Oh, by the way: it’s been such a long time since I’ve used the language (that I dimly remember from sequent calculus) that I may have forgotten. In a derivation (is that what you call it in type theory?), do you call the judgments that appear “above the bar” in the application of a rule premises, or hypotheses (or either/or)? And do you call the judgment “below the bar” the conclusion of that rule? Unless I’m advised differently, I think I would tend to say “premises” and “conclusion” (of the rule-application).
writing just $\Gamma \vdash s=t:A$; writing $(s=t):A$ looks like you are judging “$s=t$” to be a term of type $A$
Thanks for that warning!
instead saying $\Gamma \vdash A \, type$
Ah, good; that comes as a slight relief. As long as the page is dealing with what I’m calling “core MLDTT”, and in particular not with universes, I’d feel better having it this way.
I would not regard $\top,\bot$ as “atomic”, just as in propositional logic I would not regard them as part of the “theory”. In the latter case they are nullary connectives; in type theory they are nullary type forming operations. I would not use the word “variable” for atomic types either, to avoid confusion with the “variables” that appear in a context and also the “meta-variables” (see below) that are used when specifying the theory. Sometimes people say “axiomatic” instead of “atomic”.
Okay, fair enough; thanks. I agree now that there was something old-fashioned about the way I was trying to use “atomic” there. (It reminds me: I don’t like how I sometimes see people use the term “literal”, as in both a variable $x$ and its negation $\neg x$ being called “literals” in discussions of, for example, disjunctive normal forms in the propositional calculus.) I’m happy to go along with however the type theorists use the word “atomic”.
I guess I could explain it to myself [oh, but I’ll have to take this back in a minute by something further you say! so take the following merely as a first approximation] by saying that “atomic type” means a type “without connectives, including the nullary ones”, “without identity/equality” (as a type-forming operation), and “without quantifications”, by which I mean without dependent sums and products in the formula.
But: the current page does appear to make a clean distinction between “atomic types” and “parametrized types”. And according to what you say for example here:
I think what is meant here (at least, what should be meant) is that in a dependent type theory, some of the “atomic types” can be dependent on other atomic types, or on other types built from them.
it sounds at the very least that something important to get straight on is being glossed over. Well, I guess we’ll get there in time.
But it seems one really has to pay very close attention! You said “atomic types can be dependent… on other types built from (other atomic types)”. Ah-hah! So does this mean that we can consider something like
$(A \times B)(x)$
as being a well-formed expression [where I’m using $A, B$ as meta-variables for which we can substitute general types], and, I presume importantly, that this expression is considered an atomic type? To put it in slightly picturesque form: each time we freshly introduce a dependency in type-formation, that new type $B(x)$ is deemed atomic?
Well, I don’t know that I want to go too much farther down this path of pursuing what is meant by “atomic type”, because I’m beginning to wonder if that vocabulary isn’t simply an excellent way of engendering further confusion in the reader. I think the most I might venture at this point is my second approximation: the idea that a type $X$ could be considered “atomic” as long as the last application used to form $X$ is not one of the connectives, or formation of an identity type $id(s, t)$, or a “quantification”. In particular, if we start with any type expression $A$ and form the expression $A(x)$, then that denotes an atomic type. But that wouldn’t quite fit the scheme on the page, where it says “For any [valid] context $\Gamma$, if $A$ is an atomic type, [then] $\Gamma \vdash A$ [is a valid judgment]”. It would have to be a little more subtle than that in the case of dependent atomic types. (Hope it’s clear what I’m saying.)
Well, I’m sure you’ll let me know if I misinterpreted you here.
I don’t know that that usage is standard however.
I had been under the impression that any expression $t: A$ was called a type declaration, declaring that $t$ is of type $A$. Although maybe that would be considered silly: if you’re doing everything correctly, you never need to “declare” any such thing – only at the outset where you start with a variable term do you say, “okay, I declare that this $x$ here is going to be of type $A$”.
When being careful about this, people sometimes include a separate judgment for “being a valid context”, say “$\Gamma\,ok$”
Something about this I like. It almost feels like an interaction with a proof assistant (which I’ve never used, so I mean how I imagine a proof assistant behaving): you type in an expression, and the proof assistant lets you know if it’s happy with what you wrote.
However, another approach is to regard the type- and term-validity judgments as ranging over judgments with possibly-invalid contexts. In that case, the notion of valid context does not have to be defined mutually-inductively; instead it can be defined simply on its own after the (mutual) type and term judgments have been stated. And (assuming the rules are stated correctly) we can prove, after the fact, that in any valid judgment $\Gamma \vdash A\,type$ or $\Gamma \vdash t:A$, it happens that $\Gamma$ is a valid context in this sense.
Okay, I think I sort of follow this – at least I easily believe something like that could be done. Not sure which approach I’d find more compelling or approachable. If the latter, then this remark definitely deserves to be made.
Before I try to respond to the next part of what you wrote, an understanding of which looks critical in dealing with some of the questions I’m asking, I’ll move to the last part:
this judgment is usually written without a colon as “$A\, true$”, with “$true$” just part of the judgment syntax (like the colon and the turnstile)
Okay, great. Might $true$ (or something along those lines) be a type in one of the HoTT elaborations of MLDTT? The usage here does remind me of the discussion surrounding propositional types.
(A naive and casually put question, surely – I admit I’m not trying hard to think it through on my own.)
There are really two issues here. The first is one of notation: your conclusion would generally be written as simply $\Gamma, x: A \vdash B: Type$.
Okay, I find myself starting to get confused by what you’re saying in that paragraph.
Maybe it would be wise or productive to assume or pretend that I didn’t know what I was talking about when I wrote the thing you are responding to here. At the time of writing, what I was trying to do is imagine what sort of rule (that seemed to me missing from the page) would allow one to have as its conclusion a judgment of the form $\Gamma \vdash B(x)\, Type$. (Because, after all, there are rules on the page like “If $\Gamma, x: X \vdash A(x)\, Type$, then $\Gamma \vdash (\prod x: X) A(x)\, Type$” – that premise had to come from somewhere.)
Now I think you probably already answered that question earlier in your response: that type theorists have a rule
$\frac{\Gamma \vdash a:\mathtt{A}}{\Gamma \vdash \mathtt{B}(a) \, type}$
and so, assuming $x: A$ is a typing declaration in $\Gamma$, we may infer $\Gamma \vdash x: A$ and hence we could derive $\Gamma \vdash B(x)\, Type$ as a special case of this rule. Are there other subtleties that need pointing out?
Again, thanks for taking time out of your day to answer questions!
I was under the impression that Martin-Löf dependent type theory (which I’ll abbreviate to MLDTT) was well-defined enough so that one could speak of the theory,
Yes, I think that’s basically right. (The usual abbreviation is just MLTT; I don’t think any important non-dependent type theories are attributed to Martin-Löf.) What I meant was, the page as currently written describes some dependent type theory (with, as I see now, some omissions), but it’s not clear to me that this type theory coincides with MLTT.
Perhaps the best way to clarify things would be for me to rewrite the page the way I would write it, and then you could read that and see what questions remain. Unfortunately it might be a little while before I have the time for such an exercise, so let me try to respond quickly.
I’m not actually sure how “type theorists” use the word “atomic”; maybe a “real” type theorist can weigh in. It could be that some of them use it with an implication of non-dependency, which maybe was what the author of the page had in mind by distinguishing atomic from “parametrised”. Maybe “axiomatic” is a better word for what I meant? In any case my point was that whatever words we use for it, the important distinction is between types (possibly including dependent types) that are not assigned any “meaning” by the theory (hence could be interpreted by anything at all in a model) and type forming operations (including the nullary case) that are assigned a particular meaning (and hence must be interpreted by something with some universal property). If we think of syntax as a free model of some (essentially) algebraic theory, then the atomic/axiomatic types are the generators of the free structure (akin to the generators of a free group), while the operations/rules are the operations of the theory (akin to the group multiplication in a free group).
So does this mean that we can consider something like $(A \times B)(x)$ as being a well-formed expression [where I’m using $A, B$ as meta-variables for which we can substitute general types], and, I presume importantly, that this expression is considered an atomic type?
No. (-:
We can’t “introduce new dependencies” willy-nilly. There’s no way to start with a type that doesn’t depend on $x$ and “make it depend on $x$”. Each piece of syntax comes with information about what arguments it can be applied to, which includes the case of types depending on variables. For instance, take the “cases” construct that eliminates out of a sum type: we can only write it as $cases(s,c,c')$ with three arguments, we aren’t allowed to write $cases(s)$ or $cases(s,c,c',c'')$ or just $cases$. Similarly, if $\mathtt{B}$ is an axiomatic type that is specified, as part of the “signature”, to depend on one copy of the axiomatic type $\mathtt{A}$, then we can only write $\mathtt{B}(a)$ (where $a$ is of type $\mathtt{A}$); we aren’t allowed to write $\mathtt{B}(a,a')$ or $\mathtt{B}(a)(a')$ or just $\mathtt{B}$. This is what the rule
$\frac{\Gamma \vdash a:\mathtt{A}}{\Gamma \vdash \mathtt{B}(a) \, type}$
means: here $\mathtt{B}$ is a specific axiomatic dependent type, not a metavariable representing an arbitrary type, and the presence of this rule comes from a “declaration in the signature” that $\mathtt{B}$ depends on one copy of $\mathtt{A}$. A different signature for $\mathtt{B}$ would give rise to a different rule for forming types involving it, for instance if we declared $\mathtt{B}$ to depend on one copy of $\mathtt{A}$ and one copy of $\mathtt{C}$ then the rule would instead be
$\frac{\Gamma \vdash a:\mathtt{A}\qquad \Gamma\vdash c:\mathtt{C}}{\Gamma \vdash \mathtt{B}(a,c) \, type}$
But for any given signature, each axiomatic type has only one formation rule, and there’s no general rule allowing us to “add parentheses” to an arbitrary type.
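If it helps, here is a small Python sketch (all names invented for illustration) of how a signature fixes which applications of an axiomatic type count as well-formed: each axiomatic type is declared with the list of types of the arguments it must be applied to, and anything not matching that declaration is simply not syntax.

```python
# A toy "signature": each axiomatic type name is declared together
# with the list of types of arguments it must be applied to.
signature = {
    "A": [],          # A depends on nothing
    "C": [],
    "B": ["A", "C"],  # B depends on one copy of A and one copy of C
}

def well_formed_atomic(name, args, arg_types):
    """Check that name(args) matches the signature declaration,
    given the (already-checked) types of the arguments."""
    if name not in signature:
        return False
    expected = signature[name]
    return len(args) == len(expected) and list(arg_types) == expected

# B(a, c) is well-formed when a : A and c : C ...
ok = well_formed_atomic("B", ["a", "c"], ["A", "C"])      # True
# ... but B(a), or bare B with no arguments, is not:
bad_arity = well_formed_atomic("B", ["a"], ["A"])          # False
bare = well_formed_atomic("B", [], [])                     # False
print(ok, bad_arity, bare)
```

The point is that the arity of each axiomatic type is fixed once and for all by the signature; there is no separate syntax-forming operation that “adds parentheses” to an arbitrary type.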
More later…
Oh, okay – maybe it’s beginning to percolate down to my brain. So we could take something mundane like the MLTT of categories (which roughly speaking is the syntax for the theory of categories in the doctrine of locally cartesian closed categories), where you have a type $O$ and a dependent type $M(x, y)$ where $x, y$ are variables of type $O$. That dependency is expressing the part of the signature of the theory which in ordinary parlance deals with the pairing $\langle dom, cod \rangle: Mor \to Ob \times Ob$. Does that seem more on the right track?
Yes, that’s exactly the idea.
Maybe some of the confusions stem from blurring syntax with meta-syntax as happens with notations like $B(a)$, $B[a]$, $\mathrm{apply}(B,a)$?
For instance, the nLab article uses the notations $B(a)$ and $\mathrm{apply}(B,a)$, which as far as I can tell mean different things (correct me if I’m wrong): $\mathrm{apply}(B,a)$ is part of MLTT syntax, and the $B$ and $a$ appearing in it are metavariables standing for terms of MLTT. On the other hand, $B(a)$ is not part of MLTT syntax, but meta-syntax standing for a term of MLTT that contains a free variable $a$. (Or is $a$ a meta-variable standing for an arbitrary MLTT term, so that $B(a)$ stands for a term that contains $a$ as a subterm? Later in the article a general term $t$ is substituted for $a$ in $B(a)$, which supports this view.) Assuming my interpretation of $B(a)$ is correct, then strictly speaking the $B$ in $B(a)$ cannot stand for a term of MLTT. Rather it would stand for a mapping that transforms terms to terms. This would be a meta-object, not part of the terms of MLTT. Todd is using the notation $B[a]$ and I suspect he means the same as $B(a)$?
Another related aspect that I have found somewhat confusing when reading the literature on MLTT is the slightly different use of the expression “$B$ depends on…” (or “$B$ is parametrised by…”).
I see Mike use it as “type $B$ depends on type $A$” (actually he writes $B$ is dependent on $A$). I assume this translates to the formal judgement $\Gamma \vdash B\colon A \to \mathrm{Universe}$ (I think this is also how the HoTT book uses the “depends” terminology). But I have also seen it used differently as “type $B$ depends on $x:A$”, which translates to $\Gamma, x\colon A \vdash B\colon \mathrm{Universe}$. One can pass from one formulation to the other using lambda abstraction resp. application, but conceptually there is a difference between saying $B$ is dependent on $A$ and saying $B$ is dependent on $x:A$. (I think of it as the difference between saying “$f$ is a function” and “$f(x)$ is a function of $x$”.) Anyway, I hope these remarks are not too much off track. I’m no expert in type theory, just someone trying to learn it mostly on my own.
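To spell out the passage between the two formulations (keeping the $\mathrm{Universe}$ notation, which is of course itself an assumption about the theory at hand): from $\Gamma, x\colon A \vdash B \colon \mathrm{Universe}$, lambda abstraction over $x$ gives
$\Gamma \vdash \lambda x. B \colon A \to \mathrm{Universe}$
and conversely, from $\Gamma \vdash B \colon A \to \mathrm{Universe}$, application to a fresh variable $x$ gives $\Gamma, x\colon A \vdash B(x) \colon \mathrm{Universe}$.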
Re #15: Thanks Alexis! Yes, this is exactly what I had in mind.
Re #19: I do not see this as vastly different from #2. Indeed, I would say there are the following differences (there may be more).
What is in #2 is lower level, working directly with syntax
What is in #2 is more general
What is in #2 has a distinction between terms and types built into it
While it doesn’t really matter, I am not sure which parts of #2 and #5 would be considered to be misleading or beside the point? When I read #17, #22, and #26, for instance, I see a lot of confusion ensuing, and a higher level of sophistication. What is in #2 is very simple, and from that perspective, I gave in #5 what I would consider to be very clear answers to Todd’s questions, clearer to my mind than the alternative answers that have been suggested. I don’t mind at all if the page is rewritten to use something like #19, but it would be interesting to discuss precisely what is deficient about #2 (I did not write down all permissible syntax constructions, e.g. for forming dependent product types, but I considered it evident how to continue the pattern of #2 to do so) and #5.
Well, let me just say that all the responses have been useful to me, in one way or another. In some cases, pertinent points had to be made by more than one person in more than one way, but together they formed a kind of Indra’s net in which the multiple reflections eventually clarified matters to me.
It’s probably not too important at this point, but I think one thing blocking my understanding was thinking primarily in terms of “pure” MLTT, the free structure generated from a discrete collection of (I suppose unparametrized) types. I kept thinking that insofar as the whole subject is about dependent/parametrized types, there should be an introduction rule that introduced a parametrization. I began to get unconfused on this point after pondering #23, even though aspects of this point were brought up earlier in the discussion.
I think maybe an example or two on the page would help future readers.
Speaking of notation… I’d seen $B[x]$ (as opposed to $B(x)$) in Seely’s old paper and thought nothing of the notational discrepancy, and still do regard it as an inessential difference of notational convention. Am I wrong? What do I need to be aware of here?
More interesting to me are the subtle changes in font, which evidently carry meaning. It seems the ordinary $B$ in mathmode is being used for a meta-variable, whereas $\mathtt{B}$ “is an axiomatic type that is specified, as part of the “signature”, to depend on one copy of the axiomatic type $\mathtt{A}$”. Assuming I’m reading things correctly here, certainly that kind of subtlety could slip past unnoticed (or even regarded as some personal idiosyncrasy or affectation) unless carefully pointed out.
When I first read #2, I found it vague and confusing; but now after writing #19, I can see that #2 is expressing something of the same idea (though less precisely). In particular, the phrases “any string you like” and “permitted to write” seem vague to me and I wasn’t sure what they meant. Now I think I see that by “any string you like” you meant to refer to the elements of a set like the $Atom$ in #19, and that by “permitted to write” you meant that it is a well-formed string, i.e. parses to an abstract syntax tree in the sense of #19. Is that right? I still don’t see #2 as lower level or more general, though; in order to make it precise you need to say what you mean by “string”, which essentially means specifying a set like $Atom$.
Regarding #5, I don’t see how to regard $x : A, B(x) : Type, y : B(x) \vdash C(y) : Type$ even as well-formed raw syntax. All the typing declarations on the left side of a turnstile must be variable declarations, and $B(x)$ is not a variable. Nor, in this type theory, are we dealing with universes yet, so “$: Type$” can’t appear on the left side of a turnstile either.
Regarding #7, the two lines above “Binary product types”:
For any context $\Gamma$, if $A$ is an atomic type, $\Gamma \vdash A : \mathrm{Type}.$
If $x : A$ is a typing declaration in $\Gamma$, then we have $\Gamma \vdash x : A.$
do in fact describe actual primitive rules that must be present in type theory. The first is the rule
$\frac{ \quad } {\Gamma \vdash \mathtt{A} \, type}$
which I mentioned in #17. Every element of the set $Atom$ must come with such a rule, otherwise there is no way to use it. It sounds like maybe you would rather allow elements of $Atom$ (what you call “strings” in #2) that don’t necessarily come with such a rule; there seems little point to that, since then they cannot appear in any valid judgments, but anyway in that case you could just redefine $Atom$ to be the subset of atomic types that do come with such rules.
The other is the “variable rule”
$\frac{ (x:A) \in \Gamma} {\Gamma \vdash x:A }$
Both of them are necessary in order for any typing derivation to “get off the ground”.
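For instance (assuming, as in #17, a signature declaring an atomic type $\mathtt{A}$ and an atomic dependent type $\mathtt{B}$ depending on one copy of $\mathtt{A}$), the two rules combine into a complete little derivation: the variable rule gives $x:\mathtt{A} \vdash x:\mathtt{A}$, and then the formation rule for $\mathtt{B}$ applies to yield
$\frac{x:\mathtt{A} \vdash x:\mathtt{A}}{x:\mathtt{A} \vdash \mathtt{B}(x)\, type}$
which is exactly the sort of judgment $\Gamma \vdash B(x)\, type$ that was being asked about.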
I think one thing blocking my understanding was thinking primarily in terms of “pure” MLTT, the free structure generated from a discrete collection of (I suppose unparametrized) types. I kept thinking that insofar as the whole subject is about dependent/parametrized types, there should be an introduction rule that introduced a parametrization
This is a good point. Although the rules permit dependent types, most of them do not introduce dependency where it was not already present. I believe the basic ways that type dependency arises in MLTT are:
Regarding notation: I was indeed using a different typeface $\mathtt{B}$ to distinguish atomic/axiomatic types (which belong to a set specified as part of the signature, and have no “internal structure” from the perspective of the theory) from arbitrary types (i.e. abstract syntax trees such that $B \, type$ is either valid or hoped to be valid). Although technically I believe both are metavariables: $\mathtt{B}$ is a metavariable ranging over atomic types, while $B$ is a metavariable ranging over syntax trees.
(More later about $B(x)$ and $B[x]$…)
So about those brackets. There seems to be a common (though not universal) convention whereby round parentheses are used for actual syntax and square brackets are used for meta-level operations on syntax. So for instance when we write $\lambda (x:A). a$, the parentheses are part of the syntax: pre-parsing they are actually tokens in a string, post-parsing they are just part of the notation for a node of an abstract syntax tree with the label $\lambda$ and the three children $x$ (a variable), $A$ (another syntax tree), and $a$ (another syntax tree). Similarly, when I wrote $\mathtt{B}(a)$, the parentheses are part of the syntax: $\mathtt{B}$ is an atomic type which is declared as part of the signature to depend on one argument of type $\mathtt{A}$ (say), and $\mathtt{B}(a)$ is a notation for a node of an abstract syntax tree with the label $\mathtt{B}$ and one child $a$ (another syntax tree).
Similarly again, $apply(f,a)$ is likewise a notation for a node labeled by $apply$ and with two children $f$ and $a$. But it gets confusing because often we simplify our notation by writing $apply(f,a)$ as just $f(a)$ (to match common practice in mathematics) or $f a$ (to match common practice in functional programming and $\lambda$-calculus). This is fine as long as we don’t get confused. In particular, my notation $\mathtt{B}(a)$ was not a simplification of $apply(\mathtt{B},a)$; the latter doesn’t even make sense, because $\mathtt{B}$ is not well-formed syntax on its own. Remember there are no universes at the moment. If there were a universe, called $Type$ say, then instead of an “atomic type” $\mathtt{B}$ depending on one argument of type $\mathtt{A}$, we could instead postulate an atomic term $\mathtt{B}$ of type $\mathtt{A} \to Type$, and then we could indeed write $apply(\mathtt{B},a)$ to obtain an element of $Type$, which we could then coerce (explicitly or implicitly) to an actual type. The HoTT Book uses Russell universes, meaning that the latter coercion is implicit, and an exhaustive tower of such, meaning that every type belongs to some universe, and thus we can use the latter mode almost exclusively. But they are distinct.
Now something that we often do with syntax is substitute for variables, and the common notations for this involve square brackets. I usually write $M[a/x]$ for the result of substituting the term $a$ for the free occurrences of the variable $x$ in the term $M$; other notations include $M[x/a]$, $[a/x]M$, and $M[x\mapsto a]$. This is not part of the actual object-level syntax: it doesn’t represent any node of an abstract syntax tree. (Well, there are type theories in which “explicit substitution” is an object-level syntactic construction, but let’s keep it simple and not go there right now.) Instead it’s an operation that acts on such trees. For instance, if we substitute $apply(f,a)$ for the free variable $y$ in $\lambda(x:\mathbb{N}). (x+y)$, we get $\lambda(x:\mathbb{N}). (x + apply(f,a))$. We can thus write
$(\lambda(x:\mathbb{N}). (x+y))[apply(f,a) / y] = \lambda(x:\mathbb{N}). (x + apply(f,a))$
as long as we remember that this $=$ denotes a syntactic equality of terms, which is distinct from (and stronger than) both the object-level equality type and even the judgmental equality judgment.
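As a sketch (in Python, with terms modeled as nested tuples and all names invented), the meta-level substitution operation $M[a/x]$ looks like this; note that it replaces only free occurrences, stopping at a binder for the same variable.

```python
# Meta-level substitution M[a/x] on syntax trees, where a tree is
# either a bare string (a variable) or a tuple whose first entry is a
# node label, e.g. ("apply", "f", "a") or ("lam", "x", body).
def subst(term, a, x):
    """Return term[a/x]: replace free occurrences of variable x by a."""
    if isinstance(term, str):              # a variable
        return a if term == x else term
    if term[0] == "lam":                   # ("lam", bound_var, body)
        _, var, body = term
        if var == x:                       # x is bound here: stop
            return term
        return ("lam", var, subst(body, a, x))
    # any other node: substitute in all children
    return (term[0],) + tuple(subst(c, a, x) for c in term[1:])

fa = ("apply", "f", "a")
# Substituting for the free variable y under a binder for x:
print(subst(("lam", "x", ("plus", "x", "y")), fa, "y"))
# Substituting for the *bound* variable x leaves the term unchanged:
print(subst(("lam", "x", ("plus", "x", "x")), fa, "x"))
```

This function lives entirely at the meta-level: it consumes and produces syntax trees, but no node of any tree is labeled by it.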
Under these notational conventions, there’s nothing for $B[x]$ to mean. We can’t substitute something without knowing what we’re substituting it for. I think that probably when Todd wrote $B[x]$ he had something in mind more like my $\mathtt{B}(x)$. However, there are a couple of meta-level meanings that are sometimes given to notation like $B[x]$. One is that we might choose to write $B[x]$ to mean “a term which contains $x$ as a free variable”, allowing us to then write $B[a]$ instead of $B[a/x]$. More precisely, one could in this case consider $B$ to be a meta-variable for a “term with a hole”, and the notation $B[-]$ indicates the term obtained by substituting something in for that hole. We can thus regard a “term with a hole” as a presentation of a meta-level function from terms to terms, and $B[a]$ as the meta-level application of this function to the term $a$. If we go even further and represent terms using higher-order abstract syntax, then we could use the same notation $B[a]$ for the literal meta-level application of a function to a term, to distinguish it from the object-level notations like $f(a)$ for $apply(f,a)$.
Finally, regarding “depends on”: I would include the variable $x$ in the phrase if and only if the variable $x$ actually appears freely in the term under discussion. So, as Michael said, a term $B$ of type $A\to Type$ could be said to “depend on $A$”, as there is no variable $x$ around. On the other hand, a type $B$ in which $x$ occurs freely, and for which $x:A \vdash B \, type$ (or perhaps $x:A \vdash B : Type$) is valid, can be said to depend on $x:A$ because the variable name matters. We can get from the first to the second by application, and the second to the first by abstraction. In all of this, $B$ is a metavariable standing for a term which could contain free variables.
However, when I said that the atomic type $\mathtt{B}$ “depends on $\mathtt{A}$”, I meant something different from both of them: that the signature declares that $\mathtt{B}$ depends on one argument of type $\mathtt{A}$. The resulting syntax allows us to write $\mathtt{B}(a)$ whenever $a:\mathtt{A}$, and in particular if $x:\mathtt{A}$ is a variable then we can write $\mathtt{B}(x)$ to obtain a type that “depends on $x:\mathtt{A}$” in the second sense above. But $\mathtt{B}$ itself is not yet a well-formed piece of syntax, it’s an element of the signature which only becomes well-formed in the construction $\mathtt{B}(a)$. So in particular $\mathtt{B}$ does not contain any free variables, nor does it belong to a function type.
Thanks for those clarifications Mike, that helped me.
I’m glad we’re having this conversation right now, by the way, because in the Initiality Project we’ll want to explain all these aspects of syntax as clearly as possible for newcomers/outsiders, since the goal is to be as expository as possible for a wide audience.
As you might guess, that’s also why I started this thread: to prepare myself for the Initiality Project. The Martin-Löf page is one of the most developed pages we have that treats the parts of type theory that CwF’s directly interact with.
I also appreciate that Todd started this thread.
Maybe for outsiders like myself —who were not trained to think carefully about syntax, let alone to treat it as a mathematical object— it might help to use some notational conventions that distinguish more clearly between syntax and meta-syntax. I’m imagining things like different fonts (or font colors) for metavariables, for substitution or for the stronger equality symbol that Mike used in #32.
Personally, it took me quite some time until I understood that the notation for substitution was not part of the syntax of type theory. And still now I’m not always sure if certain symbols are part of the “object syntax”, part of the “judgement syntax” or part of the even higher meta-level language that’s used to talk about judgements. But maybe it’s just me who gets confused by these things.
It’s not just you!
I guess #19 then is laying out the well-formed strings in the object syntax – or at least describing in a conventional shorthand the construction trees. Then it’s sort of left to the rules (for deriving judgments) to sort out what is what (which linguistic items correspond to terms, which to types, etc.).
Right. It’s a choice whether to define all the raw types+terms together like that in one inductive definition, or to define the raw types and the raw terms separately by a pair of mutual inductive definitions. At this point in my life I am gravitating towards the former, partly because it seems to more naturally support universes a la Russell where the same syntax is used for a term belonging to a universe and for its corresponding type.
Oh, I forgot to reply to this:
do you call the judgments that appear “above the bar” in the application of a rule premises, or hypotheses (or either/or)? And do you call the judgment “below the bar” the conclusion of that rule? Unless I’m advised differently, I think I would tend to say “premises” and “conclusion” (of the rule-application).
I think that “premises” and “conclusion” is also what I’ve been taught for the judgments above and below the bar of a rule, although I still have to stop and think every time I use them to make sure I’m not mixing them up with the “hypotheses” or “antecedents” on the left-hand side of a turnstile and the “consequent” on the right-hand side of a turnstile.
Re #29: thanks very much for your reply, Mike! I find it very interesting, because I think the objections you raise are not in fact obstructions to understanding type theory as in #2. This is perhaps most clear in the following.
I still don’t see #2 as lower level or more general, though; in order to make it precise you need to say what you mean by “string”, which essentially means specifying a set like $Atom$.
The main aspect of what I was saying was that a set like $Atom$ is not needed to say precisely what a type theory is. I am taking a string as a primitive concept, at the level of human agreement: literally consecutive characters on a page (or a screen). This is why I say that what I am doing is lower level, and has no a priori mathematical significance: I am literally working at the level of writing symbols on a page/screen. In particular, I was not trying to define type theory as a mathematical object in some existing foundations: in that case, I completely agree that #19 is better.
Moreover, I think that substitution, variables, and all that are not needed to understand type theory formally. Of course they are being used in #19, but if one works at a lower, completely syntactic level (manipulating strings), as I do in #2, they are not needed, and in my opinion one obtains a considerable simplification ontologically.
Regarding #5, I don’t see how to regard $x : A, B(x) : Type, y : B(x) \vdash C(y) : Type$ even as well-formed raw syntax. All the typing declarations on the left side of a turnstile must be variable declarations, and $B(x)$ is not a variable. Nor, in this type theory, are we dealing with universes yet, so “$: Type$” can’t appear on the left side of a turnstile either.
As above, the objections here are not fundamental to understanding what a dependent type theory is. If we write instead $x : A, y : B(x) \vdash C(y) : Type$ or whatever one’s favourite way of writing it is, the meaning is the same. Note that I referred to this kind of syntax as shorthand/notation (’we may write’), i.e. exactly what one chooses to write is not significant. The only thing that is really important is that one defines how to form new pieces of syntax given previous ones: whichever shorthand one chooses just has to allow one to understand why what is meant is legitimate.
…do in fact describe actual primitive rules that must be present in type theory. The first is the rule
$\frac{\quad}{\Gamma \vdash \mathtt{A} \, type}$ which I mentioned in #17. Every element of the set $Atom$ must come with such a rule, otherwise there is no way to use it.
As above, there is no set $Atom$ in #2. What is a type in #2 is defined inductively (allowing primitive/axiomatic declarations), and is defined purely by being legitimate syntax which is part of a valid judgement. Once it is judged a type, it can be used. And again, I consider this (radical?) simplicity to be a much clearer way to proceed.
Similarly for the other rule: it is bookkeeping needed in a certain way of saying what a type theory is, but it is not fundamentally necessary, and in particular is not needed if one proceeds as in #2.
In conclusion, I would like to strongly counter the objection that what is in #2 is not precise or formal. It is completely formal and rigorous, and completely precise: it is easy to write down code which will only allow one to make syntactical constructions/validity judgments as in #2, using one’s favourite language’s notion of a string.
To summarise, what I would say (and this is what I was trying to convey in #9) is that type theory is ultimately a formal language: it is something that one writes down. What one cares about are judgements that a certain term belongs to a certain type, and one has rules for how to make such judgements. From #2, one can always produce a written-down derivation that is the same as one which one would produce as in #19. The difference is that everything in #2 is completely explicitly part of the same syntax; there are no pre-existing notions of ’variable’, ’atomic type’, and so on.
It seems to me contradictory to say that what you’re doing is not a mathematical object and yet is completely formal and rigorous. I am not familiar with any non-mathematical notion of rigor that could be applied here.
I should definitely clarify that! I mean that it is not a mathematical object in the sense of being defined inside some pre-existing formal system. It is mathematical in the deeper/human sense of defining and studying some entity according to certain rules, and everything expressed in it is mathematics. This is how it should be for a foundations that can stand alone!
But if that is too philosophical, one can also take the following working definition: the formal language can be defined in a programming language. That is unarguably at least as rigorous as defining things on paper/on a screen at the formality level of, say, #19.
It may help if I note that, by contrast with some earlier comments, I see a big difference between working at the level of strings and working at the level of a set of atoms, precisely because a string is at the same primitive ontological level as that at which one actually writes/types when working in a type theory, i.e. it is all ultimately just writing in the ordinary human sense.