Want to take part in these discussions? Sign in if you have an account, or apply for one below
Vanilla 1.1.10 is a product of Lussumo. More Information: Documentation, Community Support.
What if we use this transitivity?
$\frac{\Gamma \vdash N \Leftarrow A \qquad \Gamma \equiv \Gamma \vdash M \equiv N\,:\,A \equiv A \qquad \Gamma \equiv \Gamma' \vdash N \equiv M'\,:\,A \equiv A'} {\Gamma \equiv \Gamma' \vdash M \equiv M'\,:\,A \equiv A'}$(But I still don’t like doubled contexts.)
I agree that if the type synthesis judgments in a set-theoretic presentation don’t respect $\alpha$, then writing down the same judgments in a HoTT presentation will define a “different” relation. But the two relations will define the same partial functor (viewing $\alpha$-equivalences as isomorphisms in a (thin) groupoid of terms – in HoTT this would be just a map of types). Back on #193, I think this is what I had in mind:
be careful to be sure that the checking judgement is admissibly closed under $\alpha$ at the end of the day.
Because the only rule for checking is to switch modes:
$\frac{\Gamma \vdash M \Rightarrow A \qquad \Gamma \vdash A \equiv B \, type}{\Gamma\vdash M \Leftarrow B}$it suffices to show that $\alpha$-equivalence implies judgmental equality, and this should be a straightforward induction, especially with the doubled congruence rules matching the heterogeneous $\alpha$-equivalence. Basically what Matt said in #197, although only for the RHS, since there is no conversion rule for LHSs. Respect for $\alpha$-equivalence on the LHS would I guess be a separate induction over the type synthesis rules (which in a HoTT version would be just $ap$ of the type synthesis function).
With intensional type theory (ITT), it’s impossible to express constructions that don’t respect isomorphism, because there are HoTT models where such constructions are impossible. With a general CoV, there are models, like de Bruijn indexes, where operations that don’t respect $\alpha$ equivalence are impossible, but with the abstraction they come back? Oh, because the operations on terms are not part of the abstraction?
I would say it like this. If we define a general CoV in ITT, it’ll be impossible to express constructions that don’t respect isomorphism, because there are HoTT models where such constructions are impossible. If we define a general CoV in set theory, it is possible to express operations on it that respect $\alpha$-equivalence in some particular CoVs (de Bruijn indices) but not in other particular CoVs (named variables). This is just like in category theory in general: if all you know is that you have an arbitrary category (in a set-theoretic foundation), you can’t define any operation on its objects that always fails to respect isomorphism, because for all you know the category might happen to be skeletal. But you can define operations like “is equal to $a$” which, for some particular categories (namely, non-skeletal ones) will fail to respect isomorphism.
Regarding equality reflection for heterogeneous equality, I would expect that just as the context for a typing judgment should be “fully general”, the context for an equality judgment should be “fully general”, meaning in this case an arbitrary doubled form $\Gamma \equiv \Gamma'$. But I think I should also modify what I said in #184 to add $\Gamma\equiv \Gamma' \vdash A\equiv A'\, type$ as a presupposition for the judgment $\Gamma\equiv \Gamma'\vdash M\equiv M' \,:\, A\equiv A'$. Then I think it makes sense to write equality reflection as
$\frac{\Gamma \vdash P\Leftarrow Eq_A(M,M') }{\Gamma\equiv \Gamma' \vdash M\equiv M' \,:\, A\equiv A'}$where the presuppositions $\Gamma\equiv \Gamma' \,ctx$ and $\Gamma\equiv \Gamma'\vdash A\equiv A' \,type$ mean that the choices of $\Gamma$ and $A$ (as opposed to $\Gamma'$ and $A'$) in the premise are unimportant.
However, despite having suggested totally heterogeneous judgmental equality myself, at the moment I’m not really happy with it. For one thing, it would make the rules of our type theory look significantly less familiar (and significantly more verbose) to the intended audience, which I think is an important consideration. Also I don’t like any of the transitivity options.
Hmm, the thing to be careful about here is: what is the evidence that $x \in V$ for an object $V$ of $\mathbb{F}$? If well-scoped terms include a clause:
- $x$ is a term with free variables $V$ if $x \in V$
then it needs to be the case that the proof of $x \in V$ isn’t itself counted as part of the data of the term (which it would be, in the naive inductive family reading of that), or else the variable is really both the name and the de bruijn index (or other proof that $x \in V$).
What exactly are you worried about here? Are you saying that if the “data” of a “variable-use term” somehow “includes” the de Bruijn index as well as the name, then we aren’t “really” using named variables as opposed to de Bruijn indices?
I don’t really find this worrying. As soon as we have ordered contexts (which we do as soon as we get to the rules for dependent type theory, unless we go Andromeda-style), every variable name uniquely determines a de Bruijn index anyway. It doesn’t seem to me to matter much whether that de Bruijn index is “part of the data” or simply “automatically computable from the data” – in the former case, a user could omit to write it anyway since it can be automatically inferred, so in practice there doesn’t seem much difference.
Also, I think whether the de Bruijn index is really “part of the data” depends on what you mean by “part of the data” and how you present set-membership. In a set-theoretic metatheory, of course, a “proof of $x\in V$” isn’t any kind of “data” at all. In a type-theoretic metatheory, if $V:Var\to Prop$ and fresh extensions are defined by $W(y) = (y=x) \vee V(y)$, then the term witnessing $V(x)$ will indeed syntactically include a sequence of $inr$’s that counts out the de Bruijn index, but it will inhabit a proposition so it’s not clear to me whether that should really be considered “data”. And there are other ways to present it even in type theory, such as $V:Var\to bool$ with $W(y) = if\, (y\stackrel{?}{=}x)\, then\, true\, else\, V(y)$, for which the term witnessing $V(x)$ (i.e. $V(x)=true$) will be just $refl_{true}$.
Re #204:
$\frac{\Gamma \vdash P\Leftarrow Eq_A(M,M') }{\Gamma\equiv \Gamma' \vdash M\equiv M' \,:\, A\equiv A'}$
I thought the $M'$ is used in the wrong scope. You have to rename it. But otherwise yes, that makes sense, and I like it better.
But needing the renaming seems to indicate that we’re actually relying on the fact that it’s possible to work homogeneously. The heterogeneous style seems to make everything except congruence harder, and there was nothing really wrong with homogeneous congruence. So I completely agree that totally heterogeneous equality is a bad choice. (I know, I keep saying that.)
The heterogeneous style seems to make everything except congruence harder, and there was nothing really wrong with homogeneous congruence.
Right. And the fact that we reed a renaming here I think underscores the point: the only thing “wrong” with homogeneous congruence was that we needed a renaming.
Re #203, so I take that as a “yes”.
Oh, because the operations on terms are not part of the abstraction?
…But you can define operations like “is equal to $a$” which, for some particular categories (namely, non-skeletal ones) will fail to respect isomorphism.
Right. Because the operations on objects are functions outside the category.
I think I see now that we can use the style of rule I had in mind for synthesis judgments, and that it may or may not result in a relation that respects $\alpha$, but in any case it should work.
Okay, to summarize what I think we’ve decided, and what I’m thinking about and deciding between right now. I still don’t want to use de Bruijn indices only, so the two options I have in mind are working with a general CoV or with well-scoped Barendregt-named syntax. The main difference between the two in practice is that in the former case, referring to a variable other than the innermost-bound one requires an explicit weakening: $\lambda x. \lambda y. x[y]$ and $\lambda x. \lambda y. \lambda z. x[y][z]$, the number of brackets of course being the de Bruijn index so that we are in a sense formally using both names and indices. This difference seems unlikely to impact the proof of initiality much, since I expect rarely will we be working with more than one particular variable at a time: only a few operators actually bind more than one variable in the same subterm (e.g. $natrec$ binds both the predecessor and the result of the recursive call in its inductive step, and similarly the induction principle for positive $\Sigma$-types binds both projections at once).
With either of these choices, we have the same definition of heterogeneous $\alpha$-equivalence, weakening/renaming, and necessarily-capture-avoiding substitution. Some of the typing and equality judgments involve renaming in the premises, specifically because the “inputs” of their conclusions must be fully general with respect to the variables they bind, but in the premises some of these variables must be identified. It’s possible this renaming might need to involve a universal quantification to make the inductions work (although my inclination would be to try it without that first and see if we run into trouble). This should make these judgments $\alpha$-invariant in these arguments by a straightforward induction.
The type synthesis judgment could make a specific choice about what variables to bind in the “output” argument of its conclusion, in which case it would not be $\alpha$-invariant in that argument. This may not be a problem, although it would mean that the judgment would mean something different if written down in HoTT with a univalent CoV where there is an implicit transport operation along $\alpha$-equivalence. Or type synthesis could “nondeterministically” choose an arbitrary variable (/ fresh extension) to bind in its output, making it $\alpha$-invariant in that argument as well by the same kind of induction. (This “nondeterminism” doesn’t seem too problematic to me; it reminds me of Conor’s rule “post” which says that if a term synthesizes a type, it also synthesizes any reduct of that type. In both cases, a synthesized type is not literally syntactically unique, but is unique up to some stricter notion of sameness than judgmental equality.)
I think I prefer a homogeneous equality judgment, which will have to do some renaming in its premises as well. It treats the terms or types being compared as inputs, and is $\alpha$-invariant in them for the same reasons. The type argument to term equality is kind of “un-moded” and can have its bound variables chosen arbitrarily in the conclusion, because term equality has an explicit conversion rule enabling us to replace the type by any judgmentally equal type, hence a fortiori any $\alpha$-equivalent one. But we could also allow it to choose its bound variables non-deterministically, for consistency.
The main thing I think is missing is some syntactic characterization of rules that are automatically $\alpha$-invariant, so that we wouldn’t have to prove by hand that all of the judgments are such. Anything else?
The main thing I think is missing is some syntactic characterization of rules that are automatically $\alpha$-invariant, so that we wouldn’t have to prove by hand that all of the judgments are such. Anything else?
I think that characterization of rules should be a type whose elements are descriptions of “hygienic” rules. This same notion of rule should not only let us prove $\alpha$-invariance generically, but also the admissible structural rules. I think this notion of rule is still pretty permissive, and that $\alpha$-invariance and the structural rules both have to do with the responsible use of variables.
It will probably be a similar thing to Conor’s patterns/terms for inputs in the conclusion/premises, and terms/patterns for the outputs. I think that’s mainly to get $\alpha$-invariance, which sounds harder than structural rules would’ve been with de Bruijn indexes.
I’ve been thinking about it in a semantic way. The mutually defined judgments are an inductive family indexed over something like $(Ctx\times_V Ty) + (Ctx\times_V Ty \times_V Tm) + \dots$, with one summand for each judgment. Each of these sets $Ctx, Ty, Tm$ is actually a groupoid, and the inductive type can be defined as an indexed W-type in $Cat$, automatically obtaining a notion of (iso)morphism between derivations indexed over (iso)morphisms between their conclusions. (The forgetful functor from $Cat$ to $Set$ should preserve this W-type, since it preserves both limits and colimits, having both adjoints, and the $\Pi$-type in the indexed W-type is over a map with discrete fibers; so this really does consist of isomorphisms between the derivations we already had.) Now if the dependent polynomial that defines the indexed W-type preserves isofibrations, then its initial algebra will also be an isofibration, which means that derivations can be transported across $\alpha$-equivalences. This preservation of isofibrations should follow roughly from knowing that the conclusions of every rule are isofibrations, i.e. can be transported across $\alpha$-equivalences. So for instance a rule of the form $\Gamma \vdash \lambda x.M \Leftarrow \Pi(x:A).B$ is not an isofibration, because an arbitrary $\alpha$-equivalence could break the identification between the two copies of the variable $x$, whereas a modified version $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ would be.
Nice!
Re #211, OK, interpreting all that liberally, I think it’s pretty much compatible with what I had in mind. What the pattern/term business would be doing is additionally providing a syntactic check for whether <something> is an isofibration.
I have some clarification requests:
The mutually defined judgments are an inductive family indexed over something like $(Ctx\times_V Ty) + (Ctx\times_V Ty \times_V Tm) + \dots$, with one summand for each judgment.
You mean one summand for each judgment form, right? So for example, that first one is for $\Gamma \vdash A type$?
Those fibered products are saying that the different subjects of the judgment form use the same scoping context?
So the idea of those types being groupoids is that they’re all fibered over the CoV, their objects are various syntactic structures (for example the objects of $Ctx$ are raw (well-scoped) contexts), and their morphisms are proofs of $\alpha$ equivalence over some bijection of the variables? And the functor to the CoV just keeps the bijection? And the rest of the proof is unique?
Now if the dependent polynomial that defines the indexed W-type preserves isofibrations, then its initial algebra will also be an isofibration, which means that derivations can be transported across $\alpha$-equivalences.
What is it exactly that you want to know is an isofibration? A functor from a groupoid to the CoV? Or from a groupoid of derivations to a groupoid of judgments? (For example $ValidTy \to (Ctx \times_V Ty)$) The latter would be involving $\alpha$ equivalence for derivations??
This preservation of isofibrations should follow roughly from knowing that the conclusions of every rule are isofibrations, i.e. can be transported across $\alpha$-equivalences. So for instance a rule of the form $\Gamma \vdash \lambda x.M \Leftarrow \Pi(x:A).B$ is not an isofibration, because an arbitrary $\alpha$-equivalence could break the identification between the two copies of the variable $x$, whereas a modified version $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ would be.
What is it actually saying to say that $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ is an isofibration? The functor is from a subobject of $Ctx \times_V Tm \times_V Ty$ and extracts the values of the metavariables $\Gamma$, $x$, $y$, $M$, $A$, and $B$? (Sounds like a job for pattern matching.)
How is the output mode handled? Do you do something unclever about the type in a term equality?
The answer to nearly all those questions is “yes”. The thing I want to be an isofibration is the forgetful functor from a groupoid of derivations to a groupoid of judgments, and yes you could call the morphisms in the former “$\alpha$-equivalences of derivations”.
What is it actually saying to say that $\Gamma \vdash \lambda x.M \Leftarrow \Pi(y:A).B$ is an isofibration? The functor is from a subobject of $Ctx \times_V Tm \times_V Ty$ and extracts the values of the metavariables $\Gamma$, $x$, $y$, $M$, $A$, and $B$?
No, the opposite: the functor is from the groupoid whose objects are dependent sextuples $(\Gamma,x,y,M,A,B)$ and it is to $Ctx\times_V Tm\times_V Ty$, and it sends such a sextuple to the triple $(\Gamma, \lambda x.M, \Pi(y:A) .B)$. This functor should be an inclusion, so saying that it is an isofibration is saying that the subgroupoid of judgments of this form is a replete subcategory.
I’m not sure there is any sensible way to treat output modes specially in general. If we decide that output-moded arguments should also respect $\alpha$-equivalence (by nondeterministically choosing bound variables), then I think no special treatment should be required.
I’m not sure exactly what you had in mind for patterns, but one thought I had for how to formalize the “allowable conclusions” is to introduce a new judgment form for each operator that essentially pattern-matches against that operator, and require that all rules be rewritable with a totally general conclusion by adding such pattern-matching judgments to the premises. So for instance there would be 5- and 4-place judgments “$M \, islambda \, x \, A \, B \, M$” and “$C \, ispi \, x \, A \, B$” with rules
$\frac{ }{(\lambda (x:A.B).M) \, islambda \, y \, A \, B[x\leftrightarrow y] \, M[x\leftrightarrow y]} \qquad \frac{ }{(\Pi(x:A).B) \, ispi \, y \, A \, B[x\leftrightarrow y]}$and the typing judgment for $\lambda$ would be allowable because it can be rewritten as
$\frac{ N \, islambda \, x \,A \, B\, M \qquad C \, ispi \, x \, A \, B \qquad \Gamma, x:A \vdash M \Leftarrow B }{\Gamma \vdash N \Rightarrow C}$So all the renaming is folded into the pattern-matching judgments, which we can then prove once and for all to be $\alpha$-invariant (this should be doable just once at the abstract level of a set of operators with signatures).
Re #214, I need to think some more about that isofibration I got backwards.
I suspect you’re right that your approach based on groupoids of derivations will not handle output positions, or “don’t care” positions specially. More accurately, it doesn’t seem to handle thinking of judgment forms as relations where only some positions are guaranteed to respect $\alpha$.
I’m not sure whether that’s good or bad though. I think my tricks are interesting, and I was hoping to find out if they work. But certainly they aren’t needed. Especially if we think about things the way you seem to like.
Re #215, this is not the kind of pattern matching I had in mind. Your pattern matching incorporates renaming. So it’s some kind of “nominal pattern matching”. It does seem like it handles $\alpha$ invariance once and for all, if you use it, and have rule conclusions use only metavariables.
Some thoughts:
In #211, you used a hypothetical lambda rule that’s not from our system. I figured that was just because it made a better example. Is this lambda rule (in #215) actually the rule you’re proposing? Algorithmically speaking, it’s odd to use pattern matching for the output position ($C$), although I imagine this fits your semantic picture. Also, if I understand your pattern matching correctly, $x$ is not actually the fresh extension used in $N$ or $C$, it’s “nondeterministically chosen”.
Neither of these seem like problems, they’re just at odds with the discussion we’ve been having, thinking about rules in algorithmic terms. If not algorithmically-motivated, why don’t you just explicitly compose with $\alpha$ equivalence?
$\frac{N \sim_id (\lambda(x:A.B).M) \qquad C \sim_id (\Pi(x:A).B) \qquad \Gamma,x:A \vdash M \Leftarrow B}{\Gamma \vdash N \Rightarrow C}$This being the “standard trick” to get Martin-Löf inductive families from Dybjer inductive families using some appropriate equality.
The pattern matching I had in mind would not do any renaming. It would just be ordinary first-order, linear matching, as found in your favorite functional programming language. My motivation was that the hygienic rule descriptions should be able to “compile down” to the kinds of clever, algorithmically-motivated rules we discussed. I think we could still prove the appropriate $\alpha$ invariance properties generically, including the twist for the output mode. But this would probably be more complicated.
Something to keep in mind is how easy it is to work with this generic presentation of rules for the rest of the proofs by induction on typing derivations. Since my plan was for the generic rules to “compile down” to something more like what you’d write by hand, I was hoping it wouldn’t get in the way of later proofs. But with a more “interpretive” treatment of the generic definition, your more high-level approach might not be a problem. You’d have some fancier induction principle that hides the mess, somehow. Maybe look at nominal Isabelle for ideas.
One thing I like about your approach is that it seems closer to an explanation of why you can get away with the informal style of name-based rules, which doesn’t fuss so much over metavariables for binding sites.
In #211, you used a hypothetical lambda rule that’s not from our system. I figured that was just because it made a better example.
Yes, I was following on the example from #172.
I agree there is a mismatch between viewing rules algorithmically and semantically; I don’t quite know how to make sense of the algorithmic ones semantically. Your rule that explicitly composes with $\alpha$-equivalence does indeed seem to be a simpler version of what I had.
A more syntactic way of thinking about this would be with a sort of “well-scoped extrinsic logical framework”. Consider a dependent type theory containing as axioms
(Although it seems unintuitive, I don’t think we even need any connection between $ext$ and $vars$; but I could be wrong.) Then the raw terms and types can be defined in this DTT as $vctx$-indexed families $V:vctx \vdash tm(V) \, type$ and $V:vctx \vdash ty(V) \, type$ inductively generated by
and another constructor for each operator, e.g.
In a logical framework using HOAS, the type families can’t be actually inductive inside the LF, but here they can be. This DTT has de Bruijn models in $Set$, and named models in $Set$, but also named models in $Gpd$ in which $\alpha$-equivalence is tracked automatically.
Then the judgments and their rules can be defined as further inductive families in this DTT, e.g. $V:vctx, A:ty(V) \vdash validty(V,A)\, type$ and $V:vctx, A:ty(V), M:tm(V) \vdash ofty(V,A,M)\,type$ with constructors like
In this version, the $\alpha$-invariance is not automatic (i.e. the dependent polynomial functor in $Gpd$ doesn’t preserve isofibrations) because the indices in the output type of the constructors are not general; I think this is what you’re calling a “Dybjer inductive family”? And we could convert it to a “Martin-Lof inductive family” by including identity-types at the LF level (these being interpreted as $\alpha$-equivalence in the $Gpd$ models).
One reason I mention this is that it gives a way of distinguishing the “functional” judgments with output modes, as being actual (partial) functions defined at the LF level, using the recursion principle of the inductive type families $tm$ and $ty$, rather than mere inductively defined relations. So for instance type synthesis could be a function
living in the monad of partial elements. However, I suppose such a function would have to be defined inductive-recursively along with the inductive checking and equality judgments (since the latter, in particular, is not algorithmic in general), which requires rather a lot of this LF.
Moreover, actually interpreting such a dependently typed LF into $Set$ or $Gpd$ is, of course, an initiality-theorem problem, so invoking it in our description of raw syntax is a rather chicken-and-egg thing to do. So this is probably not actually helpful in any formal way, but I wanted to mention it because it helped me a bit conceptually. Maybe someone else can make something more useful of it.
Hmm, actually there’s a lot missing there… I didn’t say anything about how such a function would be defined on variables, and so clearly we do need some relation between $fresh$ and $ext$. Getting a bit overenthusiastic…
And, perhaps I should also say, the terms/ABTs form the free fat closed cartesian multicategory having the sorts and operators as generating objects and morphisms, respectively.
Does this actually work as written? If I read your definitions correctly, the objects of the multicategory you construct are the sorts of the signature, but the free closed cartesian multicategory should have new internal hom objects. In particular, any operator that actually binds something should have hom objects in its domain. So its seems to me that we obtain a cartesian multicategory that is not closed, but is instead equipped with operations that implement “composition with binding operators” but they are not represented by composition with actual morphisms.
Does that make sense? If so, what is the universal property that we are really after? It seems that the free closed cartesian multicategory would have a whole lot of new objects resulting from iterating internal homs that are mostly irrelevant to the syntax that we are describing. Maybe we are looking at the free cartesian multicategory equipped with some weaker structure.
You’re right that I was a little too glib. I think what I meant to say is something like that the terms are the full sub-multicategory of that free fat closed cartesian multicategory determined by the generating objects (the sorts). The proof of this is, I think, roughly the adequacy theorem for a logical framework encoding of syntax.
Okay, brace yourselves – I’m considering changing my mind once again. (No hobgoblins of little minds here…)
After exploring lots of options, it seems that no matter what tack we take, directly proving invariance under $\alpha$-equivalence is going to be either technically intricate, tediously annoying, or both. I’m gradually coming to feel that this difficulty outweighs the advantages I saw in using named or CoV syntax. Moreover, while CoV and related ideas are worth pursuing somewhere, the goal of this project should not be to revolutionize the world of syntax, but to take a geodesic route to an initiality theorem that is reasonably modular, though not striving for excessive generality.
As anyone who’s worked with me before knows, I have a tendency to get carried away with exciting new ideas. This is maybe not a very good attribute in the coordinator of a project like this, whose goal shouldn’t be developing really new technology but only applying the existing technology, with a few tweaks to make it more modern, comprehensible, and modular. And as I’ve said before, those goals suggest choosing presentations of syntax and semantics that are as close as possible to each other, leaving the translations to other presentations to happen purely in one realm or the other.
All of that now inclines me to recant my opposition to de Bruijn indices. As Dan said in #163,
if the idea is to make the two sides of the theorem as close as possible, then it seems like using de Bruijn for this part, and relying on a prior translation from a named representation to de Bruijn, would be acceptable.
My only response to this was that I didn’t want to actually use de Bruijn syntax when writing out the proof (and I didn’t want, and still don’t, to write down named syntax and pretend that it is de Bruijn syntax — if the syntax is actually de Bruijn, we should be honest and write it that way). But after thinking about whether it would be practical to use CoV for the proof, I observed (#209)
This difference seems unlikely to impact the proof of initiality much, since I expect rarely will we be working with more than one particular variable at a time: only a few operators actually bind more than one variable in the same subterm
Insofar as this is true, it seems equally to apply to de Bruijn indices. In fact, at the moment it’s not clear to me that in the course of the initiality proof we will ever have to actually write down a particular term containing any variable usage as a subterm: in the clause for the “variable” rule, the variable usage is the term itself, while in all other clauses the subterms are metavariables. So the indices may end up being barely visible at all.
I know that Matt and Dan have already said they would like to use de Bruijn indices, so I don’t have to convince them that this is a good idea. But writing out the reasoning was helpful to get it clear in my own mind, and may help any category theorist lurkers to feel less betrayed if we do go this route — and give them an opportunity to make further arguments in favor of named/CoV syntax that I’ve missed. I’m still turning things over in my own mind, and probably will be for another week or two; but I wanted to share the direction my thoughts are now taking.
(Warning: Haven’t read #222 yet, but posting anyway.)
Re #217:
I agree there is a mismatch between viewing rules algorithmically and semantically; I don’t quite know how to make sense of the algorithmic ones semantically.
(I’m not really sure why you call this semantics. It’s just reasoning about $\alpha$ equivalence with fancy category theory, right?)
I think you figured out the key idea later on, in #218:
One reason I mention this is that it gives a way of distinguishing the “functional” judgments with output modes, as being actual (partial) functions defined at the LF level, using the recursion principle of the inductive type families $tm$ and $ty$, rather than mere inductively defined relations.
Never mind LF or recursion; as long as your “semantics” is thinking of the judgment form as a partial function, you will avoid the (conjecturally) unnecessary $\alpha$ respect for output positions. It doesn’t matter whether you then “implement” this with recursion or inductive families.
This is what I was trying to say back when I came up with it in #182:
For type synthesizing rules, there seems to be an interesting possibility: Since the type is an output, the rule gets to pick the name of a binder. It doesn’t need to respect $\alpha$ equivalence in the usual sense. The idea is that we convert a function that respects $\alpha$ into a relation, rather than simulating converting a function on equivalence classes into a relation on equivalence classes by some general machinery like nominal logic or quotients. I think this will be easier, if it works.
Your approach from #211 seems more like “general machinery”.
Re #218:
I can’t figure out what you’re getting at with all that code. But from #219, it sounds like it doesn’t work yet.
By the way, if we include explicit $\alpha$ equivalence all over the place, like in #216, I think we could prove a sort of quotient induction principle: Doing induction, you wouldn’t worry about each particular equivalence on each constructor, you’d just prove overall that your motive respects $\alpha$. This is probably the right induction principle for that approach, not something from nominal logic. Got confused. This is not the “usual” sort of quotient induction from QITs, because the derivations themselves aren’t quotiented. Actually, it doesn’t matter whether the derivations are quotiented, since we only need a non-dependent elimination form.
I think this is what you’re calling a “Dybjer inductive family”? And we could convert it to a “Martin-Lof inductive family” by including identity-types at the LF level (these being interpreted as $\alpha$-equivalence in the $Gpd$ models).
Dybjer families are inductive families with only “Protestant” indexes. Martin-Löf families have “Catholic” indexes. That religious terminology is apparently some weird joke. I don’t get it though; ask Dan. Protestant indexes have to be completely general in the conclusion, but can vary in premises. I think Protestant indexes are also called pseudoparameters. In Coq, the syntax is the same as for parameters.
Dybjer families alone cannot define identity types. LF signatures correspond to inductive-inductive families with Catholic indexes though, so I’m not sure what you mean about including identity types in LF. You can just declare identity.
Pseudoparameters are ostensibly perfect for input positions, if the rules really follow the mode discipline. For the pattern matching, you use actual large eliminations. But the large elimination somehow ends up being a pain in the butt, in Coq. (For example, you have to prove the induction principle yourself. And simpl
gets all eager, unless you tell it not to. And the default implicit arguments it chooses is too conservative.)
By the way, if we include explicit $\alpha$ equivalence all over the place, like in #216, I think we could prove a sort of quotient induction principle: Doing induction, you wouldn’t worry about each particular equivalence on each constructor, you’d just prove overall that your motive respects $\alpha$. This is probably the right induction principle for that approach…
Come to think of it, this proposal is basically just spelling out what it means to work with relations with quotiented indexes. Since informal math seems to have quotients, it seems better to work with syntax quotiented by $\alpha$ equivalence than to imitate it in the completely unimaginative way of this proposal.
Should judgement forms be part of the syntax?
So $\Rightarrow, \Leftarrow, \equiv_{Ty}, \equiv_{Tm}, Type$ would all be operators. Their sort could be “judgement”? The sorts of their arguments are easy to work out. That way contexts are simply the bound variables associated with the operators in the binding tree?
Or if that doesn’t work, have a sort of “contexts”, then an operator : which would take a variable and a type and form a “context”. An operator , for concantenating contexts together. Finally the turnstyle could take in our previous judgement and a context and make the (not so basic) kind of judgement?
Just some ideas.
I think that would be very confusing, given the rest of the setup, where we’re not using judgments-as-types to handle syntax and typing.
(I think it would be very smart to use a logical framework, and judgments-as-types, to handle initiality. But Mike seems against it for this project. I’m not sure Mike has considered all the options though. In addition to fully-dependent judgments-as-types based on LF, there’s also judgments-as-propositions based on various stripped down predicate logics. For extrinsic judgments-as-types, you only use the judgments as propositions because nothing depends on derivations; only expressions. Isabelle and λProlog work like that, using HOL. I think Delphin uses a variant of FOL.)
I’m very much in favor of logical frameworks in general; I think that may be the right way to go about proving a general initiality theorem. I’ve thought seriously about doing this with both dependent LF-like approaches (which would presumably depend on an initiality theorem proven in some other way for the dependent LF theory itself) and FOL versions (essentially abstracting the partial-interpretation approach, probably starting by interpreting the raw syntax using an initiality theorem for simply typed HOAS as I mentioned here and then building some kind of logic on top of that). Both of these are on my list of projects to do someday if no one else does them first (and if anyone is interested in working on them together, I’d be open to that too).
The main reason I don’t want to use a logical framework for this project is that this project is not supposed to be doing something new. The goal here is to clear up confusion about the existing Streicher-style methods for initiality proofs (and streamline them a little). Along the way we are generating and bringing up lots of good ideas for how to improve things in the future, and that’s great, but we need to rein in our enthusiasm and stick to the point for now. (I now include my own temporary obsession with “categories of variables”, and resistance to de Bruijn indices, in that comment. It only lasted for so long because since I’m the organizer, no one else had the authority to tell me to shut up about it; instead you all had to wait for me to give up on it on my own.)
From what I am reading, logical frameworks require a dependently typed calculus as a meta language, which means for initiality of dependent type theory it would be silly to use it as a meta language anyway.
I would agree with Mike for this project. But in general, I see a disparity between type theorists on what exactly a judgement is. Is it in the theory or metatheory. I don’t think there is a correct answer, or at least I haven’t found a convincing argument for either other than ease of use.
From what I am reading, logical frameworks require a dependently typed calculus as a meta language…
I tried to explain in my big parenthetical remark in #226 that this is not the case for extrinsic-style presentations of deductive systems.
But in general, I see a disparity between type theorists on what exactly a judgement is. Is it in the theory or metatheory. I don’t think there is a correct answer, or at least I haven’t found a convincing argument for either other than ease of use.
It seems to me that there’s agreement that the primary role of judgments is always to be something external relative to types. To a large extent, type constructors end up reflecting external structure internally, but not everything is reflected. For example, unless you have equality reflection, there’s a difference between the equality judgment form and an equality-like type constructor. There’s almost always a difference between the typing judgment form and any type constructor. If not, you’re in a very special case of type theory known as material set theory. ;) Universes reflect the “is a type” judgment form into a special case of the typing judgment, but unless you like inconsistency, the reflection is incomplete. Function types give you internal Homs. Blah blah blah…
Type theorists evidently disagree about what additional properties judgments should have, beyond the little that’s needed for them to serve their primary role of providing a setting to specify type constructors.
What exactly is meant by constant here? What are rules for constants?
The constants and their rules are a parameter of the whole theorem. The idea is that you assume some dependent functions and equations about them, except they’re operators, not functions, and the equations are judgmental equality. I think.
See also: Initiality Project - Overview#axioms