Holy cow, Mike!! I quickly looked at all your recent pages – thanks so much!!
I’ll probably be needing to take a harder look at the Plan and see how much I “get”. I will resist the temptation to say more now, before having that look.
Here is a very basic question: application has the form $App^{x:A.B}(M,N)$. I’m guessing that $A$ and $B$ are supposed to be the types of $M$ and $N$. But why do we need to mention the variable $x$ in such a term?
The type annotation $x:A.B$ here is to note that the type of $M$ is $\Pi_{x:A} B$ (in a well-typed term at least), so the $x$ is needed because it is bound in $B$. For instance, $M$ could be $\lambda x:A.\, refl(x)$, which has type $\Pi_{x:A}\, x = x$.
Right. Although actually $M$ would have to be $\lambda(x:A.x=x).refl(x)$, since as was pointed out on the other thread the codomains of abstractions also have to be annotated in general.
The type annotation $x:A.B$ here is to note that the type of $M$ is $\Pi_{x:A} B$ (in a well typed term at least), so the $x$ is needed because it is bound in $B$.
Oh, of course. That makes sense. When we mod out $\alpha$-equivalence, will there be a notational distinction between a raw term and its equivalence class?
Thanks for working on this, everyone! I wonder whether we could do this at a higher level of abstraction, so that we wouldn’t need to add new clauses to the definition of substitution and $\alpha$-equivalence every time we add new type formers. For instance, I was just looking at Bob Harper’s Practical foundations of programming languages, and he defines a general notion of “abstract binding tree” that seems to support substitution and $\alpha$-equivalence for “any term system one could give”, although I don’t completely understand it.
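For intuition, here is a minimal sketch (my own toy Python rendering, not Harper’s notation; the `Var`/`Abs`/`Op` names are invented) of how an abstract-binding-tree representation lets one generic operation, here free variables, cover all operators at once, so that new type formers need no new clauses:

```python
# A sketch of abstract binding trees: every operator node carries a list of
# "abstractors", each binding some variables over a subtree.  Generic
# operations (here: free variables) are then defined once for all operators.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Abs:               # abstractor: binds `binders` in `body`
    binders: tuple       # tuple of variable names bound here
    body: object         # a Var or an Op

@dataclass(frozen=True)
class Op:                # operator node, e.g. "App", "Pi", "lam"
    label: str
    args: tuple          # tuple of Abs

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    # uniform in the operator: union over arguments, minus the bound names
    return set().union(*[free_vars(a.body) - set(a.binders) for a in t.args])

# App^{x:A.B}(M, N) as an ABT: the annotation binds x only in the argument B.
A, B, M, N = Var("A"), Var("B"), Var("M"), Var("N")
app = Op("App", (Abs((), A), Abs(("x",), B), Abs((), M), Abs((), N)))
```

Substitution and $\alpha$-equivalence can be defined in the same operator-generic way, which is the point of Harper’s construction.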
which I might get around to fixing
Oh, maybe I shouldn’t. I don’t want to steal John Dougherty’s job.
Please do fix them, if you like! I put my name next to this task on the Initiality Project page because I was volunteering; I didn’t mean to claim this page as mine or anything.
Also, thanks for pointing out these issues. I think I might have a look at Harper’s discussion of abstract binding trees before trying to fix these problems (if you don’t fix them first). It’s true that it would be a pain to redefine all this with each new type former.
An elementary question: are $\Pi$, $\lambda$, $\mathrm{App}$ and the colon $:$ considered to be constants? (I’m assuming parentheses and the other punctuation marks like , and . are not constants of the raw syntax.) If none of the above are constants, then what would be an example of a constant in the type theory under consideration?
No, those are not constants. $\Pi$, $\lambda$, and $App$ are “constructors” of the raw syntax, i.e. labels that can be put on a node of an abstract syntax tree. The colon is like the other punctuation marks: part of how we notate abstract syntax trees and judgments in concrete syntax, with no independent meaning. In the language of n-theories, $\Pi, \lambda, App$ are part of the “doctrine”, while the constants are what specify a particular “theory” in that doctrine. For instance, if we wanted to study categories inside dependent type theory, we might consider a theory (or “signature”) containing a type constant $O$, another type constant $x:O, y:O \vdash A(x,y) \, type$, a composition constant $x:O,y:O,z:O,f:A(x,y),g:A(y,z) \vdash comp(x,y,z,f,g) : A(x,z)$, and so on. We can also use constants to implement axioms, such as function extensionality or univalence.
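As a toy illustration (my own hypothetical encoding, not anything from the project pages), the “category” signature above could be recorded as plain data, with each constant carrying its context telescope and the sort or type it inhabits:

```python
# A hypothetical encoding of the "categories" signature described above.
# Contexts and types are kept as strings for simplicity; a real treatment
# would use raw-syntax trees.
signature = {
    "O":    {"context": [], "sort": "type"},
    "A":    {"context": [("x", "O"), ("y", "O")], "sort": "type"},
    "comp": {"context": [("x", "O"), ("y", "O"), ("z", "O"),
                         ("f", "A(x,y)"), ("g", "A(y,z)")],
             "sort": "A(x,z)"},
}
```

The doctrine (here, dependent type theory with its $\Pi$, $\lambda$, $App$ constructors) stays fixed while this table varies from theory to theory.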
I see, thanks for clarifying.
We should include some examples somewhere. I wonder whether they should be interspersed, or collected all in one place?
I wonder if it is better to say something else, such as “preterm”, instead of “raw term”. It would enable us to avoid the non-subsective adjective “raw” in the phrase “raw term”: a raw term is not a term. I like the remark of Gerard Debreu [Theory of value, 1959, chapter 1, page 27, note 1], explaining why he prefers to say “correspondence” instead of “multi-valued function”:
A correspondence (N. Bourbaki’s term) has often been called a multi-valued function. Locutions such as these, where the object named by a noun and an adjective is not in the class of objects named by the noun alone, have been avoided here.
The usage “preterm” has precedents, as in Sørensen and Urzyczyn [Lectures on the Curry-Howard isomorphism, Section 1.2].
I think one can argue that a raw term is a term. It’s just not necessarily a well-typed term.
a raw term is a term.
Can you please explain? According to the last paragraph of the page, a term is an $\alpha$-equivalence class of raw terms. I don’t understand how a raw term is an $\alpha$-equivalence class of raw terms.
Hmm, I didn’t write that paragraph. I’m not sure I would agree with it. To me “raw” indicates possibly-ill-typed-ness, not the presence of variable names, and I would rather not introduce a clumsy word like “preterm” merely to be pedantic about $\alpha$-equivalence.
But in any case I would rather not argue about terminology at this point in the project, when there are so many better arguments to have. (-: Are you actually confused by the mathematics, or just not liking the terminology?
Hmm, I didn’t write that paragraph. I’m not sure I would agree with it. To me “raw” indicates possibly-ill-typed-ness, not the presence of variable names
I will wait for someone to rewrite the page according to what you are saying.
I would rather not introduce a clumsy word like “preterm” merely to be pedantic about $\alpha$-equivalence.
But in any case I would rather not argue about terminology at this point in the project, when there are so many better arguments to have. (-: Are you actually confused by the mathematics, or just not liking the terminology?
I don’t mean to be contrary; I am only trying to say that the overview of the project seems to start from the basics of type theory, and I think there was some mention of explaining the subject to mathematicians who don’t know anything about it. If one is explaining the basics of something to a novice, perhaps it is better to be precise even if it is pedantic. I understand it is a matter of personal taste, but I find the Bourbaki style of exposition very satisfying; I wonder if it is considered pedantic.
In any case, I feel uneasy when people ask me to identify things, especially at an early stage of learning a subject. A germ is a function. An element of $L^2(\mathbb{R})$ is a function. Sure, but it takes me a while to get used to such identifications.
In the present context, I am afraid that we will slip into identifying a raw term with an $\alpha$-equivalence class of raw terms as in several books that I have tried to read, and hand-wave verifications of invariance of various operations under $\alpha$-equivalence.
As you can see, my confusion is due to a mixture of ignorance and a difficulty with the accounts of type theory that I have tried to learn from.
Reading what I have written, I feel that I am not very clear. Anyway, it will do for now.
Thank you for doing so much work for this project, and for making it open to everyone.
It looks like there’s just been a misunderstanding about what’s “raw” about “raw terms”. There are three types involved:
I would also say that raw terms are terms—the “raw” is just for emphasis—but I usually use de Bruijn indexes, so I don’t usually need to worry about whatevers that remember variable names.
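A minimal sketch of what de Bruijn indexes buy you (terms here are nested tuples of my own devising): after conversion, $\alpha$-equivalent named terms become literally equal trees, so no quotient is needed at all:

```python
# De Bruijn indexes for the untyped lambda calculus: a bound variable is a
# number counting the binders between it and its own lambda, so the chosen
# variable names disappear entirely.

def to_debruijn(t, env=()):
    """t is ('var', x) | ('lam', x, body) | ('app', f, a).
    env is the stack of enclosing binder names; assumes t is closed
    relative to env (free variables would raise ValueError)."""
    tag = t[0]
    if tag == "var":
        return ("var", env.index(t[1]))        # innermost binder is index 0
    if tag == "lam":
        return ("lam", to_debruijn(t[2], (t[1],) + env))
    return ("app", to_debruijn(t[1], env), to_debruijn(t[2], env))

# \x. x  and  \y. y  convert to the same tree:
id_x = ("lam", "x", ("var", "x"))
id_y = ("lam", "y", ("var", "y"))
```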
Proposal: “nominal terms”, “terms”, and “valid terms”
Speaking of arguing about terminology, I’m actually not so happy about calling that presheaf $Tm$. Those guys generally aren’t syntax. Is there a problem with $El$?
I don’t think we need a special name for $\alpha$-equivalence classes of terms. We can just say that we consider terms up to $\alpha$-equivalence. Just like we don’t have a special word for “set up to isomorphism” when regarding sets “structurally” as objects of the category $Set$.
And I think $Tm$ and $Ty$ are pretty standard for the presheaves in a CwF. They’re “semantic terms”. (-:
Correct definition of capture-avoiding substitution: when substituting for a bound variable, we still have to carry out the substitution in the parts of the term that the binder doesn’t scope over; and when substituting into $App$ we do have to do it in all the subterms (not sure why that was treated differently).
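To make both fixes concrete, here is a sketch of capture-avoiding substitution $M[N/x]$ on named terms with annotated binders `("lam", y, A, body)` (a toy representation of my own, not the page’s exact definition): the annotation $A$ is substituted even when $y = x$, since the binder doesn’t scope over it, and `app` substitutes in all subterms:

```python
import itertools

def fv(t):
    """Free variables; binders scope over the body only, not the annotation."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":
        return fv(t[2]) | (fv(t[3]) - {t[1]})
    return fv(t[1]) | fv(t[2])               # app: all subterms

def fresh(avoid):
    return next(v for v in ("v%d" % i for i in itertools.count()) if v not in avoid)

def subst(t, x, n):
    """Capture-avoiding t[n/x]."""
    tag = t[0]
    if tag == "var":
        return n if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, n), subst(t[2], x, n))  # all subterms
    y, ann, body = t[1], t[2], t[3]
    ann2 = subst(ann, x, n)                  # annotation is outside y's scope
    if y == x:
        return ("lam", y, ann2, body)        # body shadowed, annotation not
    if y in fv(n) and x in fv(body):         # rename y to avoid capturing
        y2 = fresh(fv(n) | fv(body) | {x})
        body = subst(body, y, ("var", y2))
        y = y2
    return ("lam", y, ann2, subst(body, x, n))
```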
Just like we don’t have a special word for “set up to isomorphism” when regarding sets “structurally” as objects of the category $Set$.
I didn’t realize that was analogous, since a “higher quotient” is normally not available for formalizing working up to isomorphism. And when it is, I didn’t think $Set$ was defined as a higher quotient.
The point is that we do just fine in practice without a higher quotient, and so we can similarly do without “lower quotients”.
When higher quotients are available, as in HoTT, the category $Set$ whose objects are h-sets doesn’t usually need to be quotiented (i.e. Rezk-completed) because univalence makes it already Rezk-complete. But if we defined it based on some stricter notion of set, for instance ZF-like sets built inside HoTT, then we would need to Rezk-complete it to obtain something that behaves like the usual category of sets.
I think we should get rid of the cases with side condition $y = x$, and just make the substitution operation undefined in those cases. Once we move to $\alpha$-equivalence classes, we can always assume that $y \neq x$, since $y$ can be renamed to something different from $x$. As it stands we have to prove that these cases do the same thing on $\alpha$-equivalent terms to show we respect the quotient, so the extra cases are just unnecessary work. You can always prove after the fact that those cases are correct and well-defined for $\alpha$-equivalence classes of raw terms. This is the usual way of doing things; it’s called the “Barendregt convention”: any bound variable is always different from any free variable. If that sounds reasonable to others, I’ll delete the extra cases.
Then I would separately define the notion of renaming a single variable, which is currently used in the definition of $\alpha$-equivalence, to avoid entangling the definition of $\alpha$-equivalence with our general substitution function.
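A sketch of the proposal, on unannotated terms `("lam", y, body)` for brevity (my own toy code): substitution simply gives up (here, raises) in the $y = x$ and would-capture cases, relying on the Barendregt convention to have renamed bound variables beforehand. The `fv_n` parameter, the free variables of $N$ passed in by the caller, is my own choice:

```python
def subst_partial(t, x, n, fv_n):
    """Partial substitution t[n/x]; fv_n is the set of free variables of n.
    Undefined (raises) when a binder clashes with x or with fv_n."""
    tag = t[0]
    if tag == "var":
        return n if t[1] == x else t
    if tag == "app":
        return ("app", subst_partial(t[1], x, n, fv_n),
                       subst_partial(t[2], x, n, fv_n))
    y, body = t[1], t[2]
    if y == x or y in fv_n:
        # Barendregt convention violated: the caller should alpha-rename y
        raise ValueError("bound variable %r not fresh; rename it first" % y)
    return ("lam", y, subst_partial(body, x, n, fv_n))
```

One then proves after the fact that this partial operation is total and well-defined on $\alpha$-equivalence classes.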
Re #32 that sounds reasonable to me. I don’t understand the jargon in #31, can you explain?
Re #33: We are currently defining “FV” sets. This is leaves-to-root propagation of variable information, analogous to Andromeda’s context joining. The approach usually favored for formalization of nominal expressions mostly/entirely avoids “FV” and keeps track of reserved variables on the way to the leaves, analogous to the usual context propagation. Variables not currently reserved can be safely considered fresh for the current subexpression, so you don’t need “FV” in side conditions. (This requires knowing the FV of the expression at the outset, to reserve them, but you usually start from closed expressions. You know. Same argument as for root-to-leaves context propagation.) Once you have a working notion of freshness, you can rename bound variables to fresh variables (edit: and then reserve them) as you enter binders. This is what I meant by “fresh-renaming-style”. All of this root-to-leaves stuff is used by Harper’s ABTs (and multiple other developments).
As for alpha equivalence not being a congruence, that was just a reminder of the same bug report from #9.
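Here is one way the root-to-leaves style might look in code (my own sketch, not Andromeda’s or Harper’s implementation): a set of reserved names is threaded downward, each binder is renamed to an unreserved name on entry, and no FV computation happens below the root, only the initial reservation:

```python
import itertools

def fresh(reserved):
    """Any name not currently reserved is safely fresh."""
    return next(v for v in ("v%d" % i for i in itertools.count()) if v not in reserved)

def rename_apart(t, reserved, env=None):
    """Rename every binder in t to a name outside `reserved`, reserving it
    for the subtree it scopes over.  env maps old bound names to new ones.
    Initially, reserved should contain the free variables of t."""
    env = env or {}
    tag = t[0]
    if tag == "var":
        return ("var", env.get(t[1], t[1]))   # free variables pass through
    if tag == "app":
        return ("app", rename_apart(t[1], reserved, env),
                       rename_apart(t[2], reserved, env))
    y = t[1]
    y2 = fresh(reserved)                      # rename on entering the binder
    return ("lam", y2, rename_apart(t[2], reserved | {y2}, {**env, y: y2}))
```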
Okay, I think I sort of see. I don’t really have an opinion, if that way is considered better then I’m fine with it.
#9 also mentioned simultaneous substitutions; do we need those defined separately? I’m thinking right now that I want to collapse the “weakening/exchange/contraction” and “substitution” lemmas at Initiality Project - Substitution into one lemma about simultaneous substitution (i.e. along a “context morphism”), to avoid doing the same work twice.
I’m worried about the “$x = y$ or $x = z$” cases in the new definition of renaming. In case $x = z$ and $x \neq y$, it doesn’t seem right to say the body doesn’t need to be renamed. It should give up. I think the old definition of renaming as a special case of partial substitution did this right, and I don’t see why we need renaming to be separate.
Thanks for finally (mostly?) fixing alpha equivalence, Max.
Oof yeah not sure what I was thinking with that side condition.
Correct definition of alpha-equivalence, I think :).
We define substitution as a partial function; in its definition, we make sure that “fresh” variables are different from all free and bound variables in the term, so that it is clear that our use of substitution is well defined.
Is every variable secretly just a 0-ary term constant?
No, and in fact, $\Pi$, $\lambda$, and $App$ are not constants either. A raw term node is either a variable, an instantiated term constant, or an instantiated primitive term operator. A raw type node is either an instantiated type constant or an instantiated primitive type operator.
The definition currently “hard-codes” the primitive operators. This may change in the future. The definition is parameterized by the sets of variables, term constants, and type constants.
One reason why it’s not a good idea to treat variables as constants is that substitution handles variables and constants differently. In general, variables and constants are understood differently.
so that a more formal definition of raw terms and raw types as rooted trees labelled by constants would go like this…
It’s also not a good idea to treat the binding sites as subtrees, since they’re never handled that way.
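To summarize the classification just described, in hypothetical code (the particular constant and operator names are only examples; the definition is parameterized by the first three sets, while the operators are hard-coded):

```python
# Parameters of the raw-syntax definition (example choices):
VARIABLES      = {"x", "y", "z"}
TERM_CONSTANTS = {"comp"}
TYPE_CONSTANTS = {"O", "A"}
# Hard-coded primitive operators:
TERM_OPERATORS = {"lam", "App"}
TYPE_OPERATORS = {"Pi"}

def term_node_ok(head):
    """A raw term node is a variable, a term constant, or a term operator."""
    return head in VARIABLES | TERM_CONSTANTS | TERM_OPERATORS

def type_node_ok(head):
    """A raw type node is a type constant or a type operator -- never a variable."""
    return head in TYPE_CONSTANTS | TYPE_OPERATORS
```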
It seems to me that in these abstract syntax trees, the children of a given node have to be (linearly) ordered.
Since we’re adding operators as we go, there’s technically nothing preventing us from adding an operator so that the node’s subtrees are not in any particular order. But I don’t expect that to happen.
shouldn’t the node D have only one child?
Yes, that must be a typo; I’ve fixed it.
I don’t see what change you made in #44 though?
In the definition of $X^V$, should the later freshenings be with respect to the previous $x_i$ (what’s written now) or the freshened versions of the previous $x_i$? (e.g. $x_2^{V \cup \{fr(V,x_1)\}}$)?
Dan’s alternative sounds right to me, unless the English description is confusing me.
$(\cdot)^{[V]} \coloneqq \cdot$
$(x\,::\,X)^{[V]} \coloneqq \mathrm{let}\; x' \coloneqq x^{[V]}\; \mathrm{in}\; x'\,::\,X^{[V \cup \{x'\}]}$
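The let-expression above, rendered as a loop (my own sketch; the `fr` function here is a stand-in assumption for the page’s freshening operation): each variable is freshened against $V$ extended with the already-freshened earlier variables:

```python
import itertools

def fr(V, x):
    """Freshen x away from V: identity if x is already fresh, otherwise
    the first numbered variant not in V.  (A stand-in for the page's fr.)"""
    if x not in V:
        return x
    return next(c for c in ("%s%d" % (x, i) for i in itertools.count(1)) if c not in V)

def freshen_list(V, xs):
    """X^{[V]}: freshen each variable, reserving the freshened name
    before moving on to the next one."""
    out = []
    for x in xs:
        x2 = fr(V, x)
        V = V | {x2}
        out.append(x2)
    return out
```

Reserving the freshened $x'$ (rather than the original $x$) is what makes repeated variables come out distinct.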
Thanks for catching those typos.
The inductive definition of abt looks awkward to me. The family is indexed by $V \in P_{fin} (Var)$ and $s$ so it doesn’t really make sense for the constructors to say “for each $V \in P_{fin} (Var)$”.
Obviously we will later make this rigorous, but I changed this now because I was confused by this reading it.
I don’t see anything wrong with it. When defining an inductive family, each constructor has to specify what the index of the element being constructed is. In general it doesn’t have to be a variable; e.g. the constructors for the $\mathbb{N}$-indexed family $Vec(A)$ of vectors (length-indexed lists) are $nil : Vec_0(A)$ and $cons : A \to Vec_n(A) \to Vec_{n+1}(A)$ (for any $n$).
These cannot be rewritten by putting the “for any $n$” out at the beginning. It’s true that the definition of ABT is special in that every constructor applies to an arbitrary index $V$ (sometimes this is called a “uniform index” I think? Or maybe a “non-uniform parameter”?), but it’s certainly not wrong to write it the way I did, and it makes more sense to me that way because it matches the more general syntax. So I reverted it. What exactly is it that you find confusing?
We generate the set when we vary the parameters of the constructor. We don’t vary $V$ so it seems silly to put it as a parameter of the constructor. I suppose it doesn’t make too much difference but it seems clearer the way I write it.
We have an inductive definition indexed by $V$ and $s$, rather than an inductive definition that also parameterises over $V$.
I am not sure what you mean by “we don’t vary $V$”. The second constructor has inputs that are indexed by different $V$’s (namely $V\cup X_i$) than the $V$ that indexes its output. This means that we are not defining a family of inductive types individually one for each $V$ (that would be when $V$ is a “parameter” – perhaps you are using the words “index” and “parameter” backwards from the standard?), instead we are defining the whole collection of types indexed by different $V$’s all at once. In particular, each set $Raw(V,s)$ does not have its own induction principle; only the family of all these sets together has a mutual induction principle. This is the same thing that happens for the index $n$ of $Vec_n(A)$, but different from what happens for the parameter $A$ of $Vec_n(A)$, where all the recursive inputs to each constructor involve the same value of $A$ as the output; the latter is the only case when I think it makes sense to say that “we don’t vary $A$”.
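For comparison, here is the standard $Vec$ example in Lean 4 syntax, where `A` is a parameter (the same in every constructor’s inputs and output) while the length is an index (differing between `cons`’s input and output), matching the distinction drawn above:

```lean
inductive Vec (A : Type) : Nat → Type where
  | nil  : Vec A 0                                  -- output index is 0
  | cons : {n : Nat} → A → Vec A n → Vec A (n + 1)  -- input index n, output n+1
```

The induction principle generated for `Vec` quantifies over all lengths at once, just as the induction principle for $Raw(V,s)$ covers all $V$ together.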
I have confused myself. I think I have cleared it up now. Ignore what I have said before, I think it is complete nonsense. However I still think Harper’s notion of ABT and the one written here are quite difficult to understand. I will try to play around with it a bit more.
Also, it is mentioned in a few places that we don’t want type variables. Would it do any harm to leave them in? I think we would only need to modify the variable constructor of ABTs to also include a sort (which would almost always be $tm$).
Ordinary dependent type theory, which is what we are trying to interpret, does not include type variables. Moreover, there is no way to interpret them in a CwF (although they can be interpreted in the “semantic logical framework” of the presheaf category).
Why do we only define substitution on terms? Why not any sort?
Also, on notational convention: $M[N/x]$ or $M[x/N]$? The latter has the advantage that it is close to $M[x:=N]$; I’m not sure what the former is good for.
Well it looks like we have substitution of terms for variables in expressions of any sort, which is what we want, since we have only term variables.
I read $M[N/x]$ as “$M$ with $N$ for $x$”, and $M[x \coloneqq N]$ as “$M$ with $x$ set to $N$”. I don’t personally like $M[x/N]$, but of course it doesn’t really matter a whole lot.
I suppose as long as we are consistent and it is obvious at first glance, the order doesn’t matter. I seem to remember Barendregt preferring $M[x:=N]$ and Harper/the HoTT book preferring $M[N/x]$. I couldn’t find anything about $M[x/N]$; maybe I just made it up? I suppose I am confusing the two.
I’ve seen both $M[x/N]$ and $M[N/x]$ used by different people, unfortunately. Some people also write $[N/x]M$. I like $M[N/x]$ because it looks like “applying” a “function” $M$ to the “argument” $N$, with a side note that $x$ is the variable that we consider $M$ to be a “function” of. As a mnemonic, I can also think of $N/x$ as $N$ coming down from the top to squash and replace all the $x$’s.