Broadly speaking, to formalize type theory you don’t need telescopes. If you decide to carry out the development using intrinsically-well-scoped terms though, it’s natural to separate possibly-open contexts into closed contexts, and telescopes, which are basically contexts that are well-scoped in some context.

Edit: More intuitively, telescopes are like contexts that are well-typed (not just well-scoped) in some context. If you only make the scoping intrinsic though, then the difference in scoping is technically the only difference between contexts and telescopes. If scoping is not handled intrinsically either, then contexts and telescopes are technically the same. (E.g. lists of (variable,type expressions) pairs.)

]]>Do we really need telescopes? In that Harper’s treatement of judgements should make them behave correctly no?

]]>The constants and their rules are a parameter of the whole theorem. The idea is that you assume some dependent functions and equations about them, except they’re operators, not functions, and the equations are judgmental equality. I think.

See also: Initiality Project - Overview#axioms

]]>What exactly is meant by constant here? What are rules for constants?

]]>From what I am reading, logical frameworks require a dependently typed calculus as a meta language…

I tried to explain in my big parenthetical remark in #226 that this is not the case for extrinsic-style presentations of deductive systems.

But in general, I see a disparity between type theorists on what exactly a judgement is. Is it in the theory or metatheory. I don’t think there is a correct answer, or at least I haven’t found a convincing argument for either other than ease of use.

It seems to me that there’s agreement that the primary role of judgments is always to be something external relative to types. To a large extent, type constructors end up reflecting external structure internally, but not everything is reflected. For example, unless you have equality reflection, there’s a difference between the equality judgment form and an equality-like type constructor. There’s almost always a difference between the typing judgment form and any type constructor. If not, you’re in a very special case of type theory known as material set theory. ;) Universes reflect the “is a type” judgment form into a special case of the typing judgment, but unless you like inconsistency, the reflection is incomplete. Function types give you internal Homs. Blah blah blah…

Type theorists evidently disagree about what additional properties judgments should have, beyond the little that’s needed for them to serve their primary role of providing a setting to specify type constructors.

]]>From what I am reading, logical frameworks require a dependently typed calculus as a meta language, which means for initiality of dependent type theory it would be silly to use it as a meta language anyway.

I would agree with Mike for this project. But in general, I see a disparity between type theorists on what exactly a judgement is. Is it in the theory or metatheory. I don’t think there is a correct answer, or at least I haven’t found a convincing argument for either other than ease of use.

]]>I’m very much in favor of logical frameworks in general; I think that may be the right way to go about proving a general initiality theorem. I’ve thought seriously about doing this with both dependent LF-like approaches (which would presumably depend on an initiality theorem proven in some other way for the dependent LF theory itself) and FOL versions (essentially abstracting the partial-interpretation approach, probably starting by interpreting the raw syntax using an initiality theorem for simply typed HOAS as I mentioned here and then building some kind of logic on top of that). Both of these are on my list of projects to do someday if no one else does them first (and if anyone is interested in working on them together, I’d be open to that too).

The main reason I don’t want to use a logical framework for this project is that this project is not supposed to be doing something new. The goal here is to clear up confusion about the *existing* Streicher-style methods for initiality proofs (and streamline them a little). Along the way we are generating and bringing up lots of good ideas for how to improve things in the future, and that’s great, but we need to rein in our enthusiasm and stick to the point for now. (I now include my own temporary obsession with “categories of variables”, and resistance to de Bruijn indices, in that comment. It only lasted for so long because since I’m the organizer, no one else had the authority to tell me to shut up about it; instead you all had to wait for me to give up on it on my own.)

I think that would be very confusing, given the rest of the setup, where we’re not using judgments-as-types to handle syntax and typing.

(I think it would be very smart to use a logical framework, and judgments-as-types, to handle initiality. But Mike seems against it for this project. I’m not sure Mike has considered all the options though. In addition to fully-dependent judgments-as-types based on LF, there’s also judgments-as-propositions based on various stripped down predicate logics. For extrinsic judgments-as-types, you only use the judgments as propositions because nothing depends on derivations; only expressions. Isabelle and λProlog work like that, using HOL. I think Delphin uses a variant of FOL.)

]]>Should judgement forms be part of the syntax?

So $\Rightarrow, \Leftarrow, \equiv_{Ty}, \equiv_{Tm}, Type$ would all be operators. Their sort could be “judgement”? The sorts of their arguments are easy to work out. That way contexts are simply the bound variables associated with the operators in the binding tree?

Or if that doesn’t work, have a sort of “contexts”, then an operator : which would take a variable and a type and form a “context”. An operator , for concantenating contexts together. Finally the turnstyle could take in our previous judgement and a context and make the (not so basic) kind of judgement?

Just some ideas.

]]>By the way, if we include explicit $\alpha$ equivalence all over the place, like in #216, I think we could prove a sort of quotient induction principle: Doing induction, you wouldn’t worry about each particular equivalence on each constructor, you’d just prove overall that your motive respects $\alpha$. This is probably the right induction principle for that approach…

Come to think of it, this proposal is basically just spelling out what it means to work with relations with quotiented indexes. Since informal math seems to have quotients, it seems better to work with syntax quotiented by $\alpha$ equivalence than to imitate it in the completely unimaginative way of this proposal.

]]>(Warning: Haven’t read #222 yet, but posting anyway.)

Re #217:

I agree there is a mismatch between viewing rules algorithmically and semantically; I don’t quite know how to make sense of the algorithmic ones semantically.

(I’m not really sure why you call this semantics. It’s just reasoning about $\alpha$ equivalence with fancy category theory, right?)

I think you figured out the key idea later on, in #218:

One reason I mention this is that it gives a way of distinguishing the “functional” judgments with output modes, as being actual (partial)

functionsdefined at the LF level, using the recursion principle of the inductive type families $tm$ and $ty$, rather than mere inductively defined relations.

Never mind LF or recursion; as long as your “semantics” is thinking of the judgment form as a partial function, you will avoid the (conjecturally) unnecessary $\alpha$ respect for output positions. It doesn’t matter whether you then “implement” this with recursion or inductive families.

This is what I was trying to say back when I came up with it in #182:

For type synthesizing rules, there seems to be an interesting possibility: Since the type is an output, the rule gets to pick the name of a binder. It doesn’t need to respect $\alpha$ equivalence in the usual sense. The idea is that we convert a

functionthat respects $\alpha$ into a relation, rather than simulating converting a function on equivalence classes into a relation on equivalence classes by some general machinery like nominal logic or quotients. I think this will be easier, if it works.

Your approach from #211 seems more like “general machinery”.

Re #218:

I can’t figure out what you’re getting at with all that code. But from #219, it sounds like it doesn’t work yet.

By the way, if we include explicit $\alpha$ equivalence all over the place, like in #216, I think we could prove a sort of quotient induction principle: Doing induction, you wouldn’t worry about each particular equivalence on each constructor, you’d just prove overall that your motive respects $\alpha$. This is probably the right induction principle for that approach, not something from nominal logic. Got confused. This is not the “usual” sort of quotient induction from QITs, because the derivations themselves aren’t quotiented. Actually, it doesn’t matter whether the derivations are quotiented, since we only need a non-dependent elimination form.

I think this is what you’re calling a “Dybjer inductive family”? And we could convert it to a “Martin-Lof inductive family” by including identity-types at the LF level (these being interpreted as $\alpha$-equivalence in the $Gpd$ models).

Dybjer families are inductive families with only “Protestant” indexes. Martin-Löf families have “Catholic” indexes. That religious terminology is apparently some weird joke. I don’t get it though; ask Dan. Protestant indexes have to be completely general in the conclusion, but can vary in premises. I think Protestant indexes are also called pseudoparameters. In Coq, the syntax is the same as for parameters.

Dybjer families alone cannot define identity types. LF signatures correspond to inductive-inductive families with Catholic indexes though, so I’m not sure what you mean about including identity types in LF. You can just declare identity.

Pseudoparameters are ostensibly perfect for input positions, if the rules really follow the mode discipline. For the pattern matching, you use actual large eliminations. But the large elimination somehow ends up being a pain in the butt, in Coq. (For example, you have to prove the induction principle yourself. And `simpl`

gets all eager, unless you tell it not to. And the default implicit arguments it chooses is too conservative.)

Okay, brace yourselves – I’m considering changing my mind once again. (No hobgoblins of little minds here…)

After exploring lots of options, it seems that no matter what tack we take, directly proving invariance under $\alpha$-equivalence is going to be either technically intricate, tediously annoying, or both. I’m gradually coming to feel that this difficulty outweighs the advantages I saw in using named or CoV syntax. Moreover, while CoV and related ideas are worth pursuing somewhere, the goal of this project should not be to revolutionize the world of syntax, but to take a geodesic route to an initiality theorem that is *reasonably* modular, though not striving for excessive generality.

As anyone who’s worked with me before knows, I have a tendency to get carried away with exciting new ideas. This is maybe not a very good attribute in the coordinator of a project like this, whose goal shouldn’t be developing really new technology but only applying the existing technology, with a few tweaks to make it more modern, comprehensible, and modular. And as I’ve said before, those goals suggest choosing presentations of syntax and semantics that are as close as possible to each other, leaving the translations to other presentations to happen purely in one realm or the other.

All of that now inclines me to recant my opposition to de Bruijn indices. As Dan said in #163,

if the idea is to make the two sides of the theorem as close as possible, then it seems like using de Bruijn for this part, and relying on a prior translation from a named representation to de Bruijn, would be acceptable.

My only response to this was that I didn’t want to actually use de Bruijn syntax when writing out the proof (and I didn’t want, and still don’t, to write down named syntax and pretend that it is de Bruijn syntax — if the syntax is actually de Bruijn, we should be honest and write it that way). But after thinking about whether it would be practical to use CoV for the proof, I observed (#209)

This difference seems unlikely to impact the proof of initiality much, since I expect rarely will we be working with more than one particular variable at a time: only a few operators actually bind more than one variable in the same subterm

Insofar as this is true, it seems equally to apply to de Bruijn indices. In fact, at the moment it’s not clear to me that in the course of the initiality proof we will *ever* have to actually write down a particular term containing any variable usage as a subterm: in the clause for the “variable” rule, the variable usage is the term itself, while in all other clauses the subterms are metavariables. So the indices may end up being barely visible at all.

I know that Matt and Dan have already said they would like to use de Bruijn indices, so I don’t have to convince them that this is a good idea. But writing out the reasoning was helpful to get it clear in my own mind, and may help any category theorist lurkers to feel less betrayed if we do go this route — and give them an opportunity to make further arguments in favor of named/CoV syntax that I’ve missed. I’m still turning things over in my own mind, and probably will be for another week or two; but I wanted to share the direction my thoughts are now taking.

]]>You’re right that I was a little too glib. I think what I meant to say is something like that the terms are the full sub-multicategory of that free fat closed cartesian multicategory determined by the generating objects (the sorts). The proof of this is, I think, roughly the adequacy theorem for a logical framework encoding of syntax.

]]>And, perhaps I should also say, the terms/ABTs form the

freefat closed cartesian multicategory having the sorts and operators as generating objects and morphisms, respectively.

Does this actually work as written? If I read your definitions correctly, the objects of the multicategory you construct are the sorts of the signature, but the free closed cartesian multicategory should have new internal hom objects. In particular, any operator that actually binds something should have hom objects in its domain. So its seems to me that we obtain a cartesian multicategory that is not closed, but is instead equipped with operations that implement “composition with binding operators” but they are not represented by composition with actual morphisms.

Does that make sense? If so, what is the universal property that we are really after? It seems that the free closed cartesian multicategory would have a whole lot of new objects resulting from iterating internal homs that are mostly irrelevant to the syntax that we are describing. Maybe we are looking at the free cartesian multicategory equipped with some weaker structure.

]]>Hmm, actually there’s a lot missing there… I didn’t say anything about how such a function would be defined on variables, and so clearly we do need some relation between $fresh$ and $ext$. Getting a bit overenthusiastic…

]]>A more syntactic way of thinking about this would be with a sort of “well-scoped extrinsic logical framework”. Consider a dependent type theory containing as axioms

- a type $vctx$ (the “contexts of variables”, e.g. the objects of $\mathbb{F}$),
- a type family $V:vctx \vdash fresh(V)\, type$, the type of fresh extensions,
- an “extend” or “codomain” operation $V:vctx, x:fresh(V) \vdash ext(V,x):vctx$, and
- a “type of variables” $V:vctx \vdash vars(V) \, type$.

(Although it seems unintuitive, I don’t think we even need any connection between $ext$ and $vars$; but I could be wrong.) Then the raw terms and types can be defined in this DTT as $vctx$-indexed families $V:vctx \vdash tm(V) \, type$ and $V:vctx \vdash ty(V) \, type$ inductively generated by

- $V:vctx, x:vars(V) \vdash var(x) : tm(V)$

and another constructor for each operator, e.g.

- $V:vctx, x:fresh(V), A:ty(V), B:ty(ext(V,x)) \vdash \Pi(x,A,B) : ty(V)$
- $V:vctx, x:fresh(V), A:ty(V), B:ty(ext(V,x)), M:tm(ext(V,x)) \vdash \lambda(x,A,B,M) : tm(V)$.

In a logical framework using HOAS, the type families can’t be actually inductive inside the LF, but here they can be. This DTT has de Bruijn models in $Set$, and named models in $Set$, but also named models in $Gpd$ in which $\alpha$-equivalence is tracked automatically.

Then the judgments and their rules can be defined as further inductive families in this DTT, e.g. $V:vctx, A:ty(V) \vdash validty(V,A)\, type$ and $V:vctx, A:ty(V), M:tm(V) \vdash ofty(V,A,M)\,type$ with constructors like

- $V:vctx, x:fresh(V), A:ty(V), B:ty(ext(V,x)), M:tm(ext(V,x)) \vdash lambdaofpi(V,x,A,B,M) : ofty(V,\Pi(x,A,B),\lambda(x,A,B,M))$

In this version, the $\alpha$-invariance is not automatic (i.e. the dependent polynomial functor in $Gpd$ doesn’t preserve isofibrations) because the indices in the output type of the constructors are not general; I think this is what you’re calling a “Dybjer inductive family”? And we could convert it to a “Martin-Lof inductive family” by including identity-types at the LF level (these being interpreted as $\alpha$-equivalence in the $Gpd$ models).

One reason I mention this is that it gives a way of distinguishing the “functional” judgments with output modes, as being actual (partial) *functions* defined at the LF level, using the recursion principle of the inductive type families $tm$ and $ty$, rather than mere inductively defined relations. So for instance type synthesis could be a function

- $V:vctx, M:tm(V) \vdash tyof(M) : Par(ty(V))$

living in the monad of partial elements. However, I suppose such a function would have to be defined inductive-recursively along with the inductive checking and equality judgments (since the latter, in particular, is not algorithmic in general), which requires rather a lot of this LF.

Moreover, actually interpreting such a *dependently* typed LF into $Set$ or $Gpd$ is, of course, an initiality-theorem problem, so invoking it in our description of raw syntax is a rather chicken-and-egg thing to do. So this is probably not actually helpful in any formal way, but I wanted to mention it because it helped me a bit conceptually. Maybe someone else can make something more useful of it.

In #211, you used a hypothetical lambda rule that’s not from our system. I figured that was just because it made a better example.

Yes, I was following on the example from #172.

I agree there is a mismatch between viewing rules algorithmically and semantically; I don’t quite know how to make sense of the algorithmic ones semantically. Your rule that explicitly composes with $\alpha$-equivalence does indeed seem to be a simpler version of what I had.

]]>Re #214, I need to think some more about that isofibration I got backwards.

I suspect you’re right that your approach based on groupoids of derivations will not handle output positions, or “don’t care” positions specially. More accurately, it doesn’t seem to handle thinking of judgment forms as relations where only some positions are guaranteed to respect $\alpha$.

I’m not sure whether that’s good or bad though. I think my tricks are interesting, and I was hoping to find out if they work. But certainly they aren’t needed. Especially if we think about things the way you seem to like.

Re #215, this is not the kind of pattern matching I had in mind. Your pattern matching incorporates renaming. So it’s some kind of “nominal pattern matching”. It does seem like it handles $\alpha$ invariance once and for all, if you use it, and have rule conclusions use only metavariables.

Some thoughts:

In #211, you used a hypothetical lambda rule that’s not from our system. I figured that was just because it made a better example. Is this lambda rule (in #215) actually the rule you’re proposing? Algorithmically speaking, it’s odd to use pattern matching for the output position ($C$), although I imagine this fits your semantic picture. Also, if I understand your pattern matching correctly, $x$ is not actually the fresh extension used in $N$ or $C$, it’s “nondeterministically chosen”.

Neither of these seem like problems, they’re just at odds with the discussion we’ve been having, thinking about rules in algorithmic terms. If not algorithmically-motivated, why don’t you just explicitly compose with $\alpha$ equivalence?

$\frac{N \sim_id (\lambda(x:A.B).M) \qquad C \sim_id (\Pi(x:A).B) \qquad \Gamma,x:A \vdash M \Leftarrow B}{\Gamma \vdash N \Rightarrow C}$This being the “standard trick” to get Martin-Löf inductive families from Dybjer inductive families using some appropriate equality.

The pattern matching I had in mind would not do any renaming. It would just be ordinary first-order, linear matching, as found in your favorite functional programming language. My motivation was that the hygienic rule descriptions should be able to “compile down” to the kinds of clever, algorithmically-motivated rules we discussed. I think we could still prove the appropriate $\alpha$ invariance properties generically, including the twist for the output mode. But this would probably be more complicated.

Something to keep in mind is how easy it is to work with this generic presentation of rules for the *rest* of the proofs by induction on typing derivations. Since my plan was for the generic rules to “compile down” to something more like what you’d write by hand, I was hoping it wouldn’t get in the way of later proofs. But with a more “interpretive” treatment of the generic definition, your more high-level approach might not be a problem. You’d have some fancier induction principle that hides the mess, somehow. Maybe look at nominal Isabelle for ideas.

One thing I like about your approach is that it seems closer to an explanation of why you can get away with the informal style of name-based rules, which doesn’t fuss so much over metavariables for binding sites.

]]>I’m not sure exactly what you had in mind for patterns, but one thought I had for how to formalize the “allowable conclusions” is to introduce a new judgment form for each operator that essentially pattern-matches against that operator, and require that all rules be rewritable with a totally general conclusion by adding such pattern-matching judgments to the premises. So for instance there would be 5- and 4-place judgments “$M \, islambda \, x \, A \, B \, M$” and “$C \, ispi \, x \, A \, B$” with rules

$\frac{ }{(\lambda (x:A.B).M) \, islambda \, y \, A \, B[x\leftrightarrow y] \, M[x\leftrightarrow y]} \qquad \frac{ }{(\Pi(x:A).B) \, ispi \, y \, A \, B[x\leftrightarrow y]}$and the typing judgment for $\lambda$ would be allowable because it can be rewritten as

$\frac{ N \, islambda \, x \,A \, B\, M \qquad C \, ispi \, x \, A \, B \qquad \Gamma, x:A \vdash M \Leftarrow B }{\Gamma \vdash N \Rightarrow C}$So all the renaming is folded into the pattern-matching judgments, which we can then prove once and for all to be $\alpha$-invariant (this should be doable just once at the abstract level of a set of operators with signatures).

]]>The answer to nearly all those questions is “yes”. The thing I want to be an isofibration is the forgetful functor from a groupoid of derivations to a groupoid of judgments, and yes you could call the morphisms in the former “$\alpha$-equivalences of derivations”.

What is it actually saying to say that $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ is an isofibration? The functor is from a subobject of $Ctx \times_V Tm \times_V Ty$ and extracts the values of the metavariables $\Gamma$, $x$, $y$, $M$, $A$, and $B$?

No, the opposite: the functor is *from* the groupoid whose objects are dependent sextuples $(\Gamma,x,y,M,A,B)$ and it is *to* $Ctx\times_V Tm\times_V Ty$, and it sends such a sextuple to the triple $(\Gamma, \lambda x.M, \Pi(y:A) .B)$. This functor should be an inclusion, so saying that it is an isofibration is saying that the subgroupoid of judgments of this form is a replete subcategory.

I’m not sure there is any sensible way to treat output modes specially in general. If we decide that output-moded arguments should also respect $\alpha$-equivalence (by nondeterministically choosing bound variables), then I think no special treatment should be required.

]]>Re #211, OK, interpreting all that liberally, I think it’s pretty much compatible with what I had in mind. What the pattern/term business would be doing is additionally providing a syntactic check for whether *<something>* is an isofibration.

I have some clarification requests:

The mutually defined judgments are an inductive family indexed over something like $(Ctx\times_V Ty) + (Ctx\times_V Ty \times_V Tm) + \dots$, with one summand for each judgment.

You mean one summand for each judgment *form*, right? So for example, that first one is for $\Gamma \vdash A type$?

Those fibered products are saying that the different subjects of the judgment form use the same scoping context?

So the idea of those types being groupoids is that they’re all fibered over the CoV, their objects are various syntactic structures (for example the objects of $Ctx$ are raw (well-scoped) contexts), and their morphisms are proofs of $\alpha$ equivalence over some bijection of the variables? And the functor to the CoV just keeps the bijection? And the rest of the proof is unique?

Now if the dependent polynomial that defines the indexed W-type preserves isofibrations, then its initial algebra will also be an isofibration, which means that derivations can be transported across $\alpha$-equivalences.

What is it exactly that you want to know is an isofibration? A functor from a groupoid to the CoV? Or from a groupoid of derivations to a groupoid of judgments? (For example $ValidTy \to (Ctx \times_V Ty)$) The latter would be involving $\alpha$ equivalence for derivations??

This preservation of isofibrations should follow roughly from knowing that the conclusions of every rule are isofibrations, i.e. can be transported across $\alpha$-equivalences. So for instance a rule of the form $\Gamma \vdash \lambda x.M \Leftarrow \Pi(x:A).B$ is not an isofibration, because an arbitrary $\alpha$-equivalence could break the identification between the two copies of the variable $x$, whereas a modified version $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ would be.

What is it actually saying to say that $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ is an isofibration? The functor is from a subobject of $Ctx \times_V Tm \times_V Ty$ and extracts the values of the metavariables $\Gamma$, $x$, $y$, $M$, $A$, and $B$? (Sounds like a job for pattern matching.)

How is the output mode handled? Do you do something unclever about the type in a term equality?

]]>Nice!

]]>I’ve been thinking about it in a semantic way. The mutually defined judgments are an inductive family indexed over something like $(Ctx\times_V Ty) + (Ctx\times_V Ty \times_V Tm) + \dots$, with one summand for each judgment. Each of these sets $Ctx, Ty, Tm$ is actually a groupoid, and the inductive type can be defined as an indexed W-type in $Cat$, automatically obtaining a notion of (iso)morphism between derivations indexed over (iso)morphisms between their conclusions. (The forgetful functor from $Cat$ to $Set$ should preserve this W-type, since it preserves both limits and colimits, having both adjoints, and the $\Pi$-type in the indexed W-type is over a map with discrete fibers; so this really does consist of isomorphisms between the derivations we already had.) Now if the dependent polynomial that defines the indexed W-type preserves isofibrations, then its initial algebra will also be an isofibration, which means that derivations can be transported across $\alpha$-equivalences. This preservation of isofibrations should follow roughly from knowing that the conclusions of every rule are isofibrations, i.e. can be transported across $\alpha$-equivalences. So for instance a rule of the form $\Gamma \vdash \lambda x.M \Leftarrow \Pi(x:A).B$ is not an isofibration, because an arbitrary $\alpha$-equivalence could break the identification between the two copies of the variable $x$, whereas a modified version $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ would be.

]]>The main thing I think is missing is some syntactic characterization of rules that are automatically $\alpha$-invariant, so that we wouldn’t have to prove by hand that all of the judgments are such. Anything else?

I think that characterization of rules should be a type whose elements are descriptions of “hygienic” rules. This same notion of rule should not only let us prove $\alpha$-invariance generically, but also the admissible structural rules. I think this notion of rule is still pretty permissive, and that $\alpha$-invariance and the structural rules both have to do with the responsible use of variables.

It will probably be a similar thing to Conor’s patterns/terms for inputs in the conclusion/premises, and terms/patterns for the outputs. I think that’s mainly to get $\alpha$-invariance, which sounds harder than structural rules would’ve been with de Bruijn indexes.

]]>