Want to take part in these discussions? Sign in if you have an account, or apply for one below
Vanilla 1.1.10 is a product of Lussumo. More Information: Documentation, Community Support.
I thought I explained why it was a good idea for one and not the other. I think it’s a good idea for the typing rules because the information flow matches what happens in the semantics. But it’s not possible to orient all the equality rules in an arbitrary type theory, so it seems the most natural to me to have just one equality judgment. (The difference semantically is that we’re looking (for now) at semantics in 1-categorical structures where equality is a mere property.) I’m unclear what kind of system could have some equality rules directed but not others, do you have something specific in mind?
Discussion about the includes is here.
I think it’s a good idea for the typing rules because the information flow matches what happens in the semantics.
I think I misunderstood you. You mean in the CwF itself? Not just the interpretation? I don’t understand how the CwF itself has information flow.
But it’s not possible to orient all the equality rules in an arbitrary type theory, so it seems the most natural to me to have just one equality judgment. (The difference semantically is that we’re looking (for now) at semantics in 1-categorical structures where equality is a mere property.)
I don’t understand this. How do you know? I didn’t think this had much to do with nature of the model itself.
Come to think of it, I don’t see why we can’t just give all equality rules, besides conversion, checking premises and a synthesizing conclusion, like with typing. I haven’t thought it through, but since our current use of bidirectionality doesn’t really have anything to do with algorithms, I don’t see any warning signs.
For this proof, I assumed the question was what information flow we should use, to make things easiest. So about PER-style rules, I’m thinking that getting rid of the typing judgments altogether makes things easier, whether we can figure out bidirectional equality judgments or not.
Bas Spitters mentioned a paper about semantics of extensional type theory using PER-style rules. (This same paper was also mentioned on the HoTT list at some point.) That’s what I’m talking about.
I’m unclear what kind of system could have some equality rules directed but not others, do you have something specific in mind?
I might know a trick. I’m not sure whether it counts. But I didn’t mean to recommend directing only some of the rules. I meant that because I didn’t know how to direct all the rules, maybe bidirectionality wasn’t such a hot idea to begin with. But now you should think about the way of directing all equality rules that I proposed above.
I mean the interpretation, not the CwF.
I don’t really understand your new proposal. Are you suggesting that we have two equality judgments and , and that the equality rules look like for instance
? That doesn’t seem to work as written, since if is an output on the bottom then where do you get it from in order to send it as an input to the premises? I suppose you could choose one of the terms to synthesize a type for in order to compute :
but that seems like needless work when equality only gets “called” by the typechecking modeswitching judgment in “checking mode” where we already know the types. Also there is of course no way at all in transitivity to “compute” , it has to be guessed. So I suppose could similarly be guessed. But I don’t see any advantage to this; it seems like making everything more verbose to no purpose. The difference is that when defining the partial interpretation, there is a difference between a function that outputs a type and a function that inputs a type, but in calculating an equality there is no difference whether the type is an “input” or an “output”. Put differently, the “mode-switching rule for equality”
is semantically trivial: the equalizer of two maps into is the same regardless of whether we know in advance that the two maps agree in .
I don’t like the idea of using PER-style rules, if for no other reason than that they won’t be very comprehensible to mathematicians. I want to keep the input type theory as close as possible to the style used in the HoTT Book, for instance. (Which is certainly a disadvantage of any bidirectionality, but directing just the typing rules is so pleasingly parallel to the partial interpretation clauses, and so little different from the undirected style, that I still think it’s worth it. I don’t see why an inability to direct the equality rules would argue against directing the type-checking rules.)
I mean the interpretation, not the CwF.
So then I don’t understand what it means to say that the current rules seem closest to the interpretation. We haven’t done the interpretation yet. The way we should do it will depend on the rules.
I don’t really understand your new proposal. Are you suggesting that we have two equality judgments and …
Yes, two equality judgments for elements. My proposal might be flawed though. The transitivity rule I was thinking of was:
doesn’t need to come from anywhere besides the derivation itself. We know the judgments aren’t algorithmic. The business about needing to guess values of metavariables is a way to explain why not, but we knew it. The directions still correspond to flow of validity. That’s why I need the .
But this transitivity rule may be too strong. I would need to wrap my head around what it’s really saying, in our situation.
So I suppose could similarly be guessed. But I don’t see any advantage to this; it seems like making everything more verbose to no purpose.
Huh? What’s verbose? You mean having an extra judgment form? Or the ?
Put differently, the “mode-switching rule for equality” […] is semantically trivial: the equalizer of two maps into is the same regardless of whether we know in advance that the two maps agree in .
In the CwF, yes. For the totality proof, for example, the rule may not be quite so trivial, depending on the induction motive. In the PER way of thinking, that mode-switching rule is a generalization of the non-equality one.
I don’t like the idea of using PER-style rules, if for no other reason than that they won’t be very comprehensible to mathematicians.
Oh, that’s too bad. In that case, maybe I should forget about it after all. My reason to believe that the equality judgments ought to be directed was that you want two partial interpretation functions, and the congruence rules are generalizations of the typing rules, in PER style.
But wouldn’t it be nice to cut the number of cases almost in half?
Which is certainly a disadvantage of any bidirectionality, but directing just the typing rules is so pleasingly parallel to the partial interpretation clauses, and so little different from the undirected style, that I still think it’s worth it.
This too, I don’t understand. You chose to have two partial interpretation functions (for elements), corresponding to bidirectionality. You could’ve just had one. And then you wouldn’t be able to say that bidirectional judgments are pleasingly parallel.
There must be some other reason for both of them, right? But you are not acknowledging it at all, in this thread, for some reason.
I don’t see why an inability to direct the equality rules would argue against directing the type-checking rules.
It’s not like directing only the typing rules is broken. It’s just that I figured that the minor difficulties we’re trying to avoid by using bidirectional typing would just come up anyway with unidirectional equality.
We haven’t completed the interpretation yet, but we have a proposal for its general outline, from Peter’s quoted email in the original nForum discussion, which I elaborated and tweaked in Initiality Project - Overview and have now further refined at Initiality Project - Partial Interpretation and Initiality Project - Totality. In particular, Peter’s proposal to take the “environment” of variables as an input to the partial interpretation function, rather than an output, is suggested not by an a priori desire to analogously make context validity an input to the typing judgments, but rather by a desire to make the recursion in the partial interpretation function a structural recursion over raw terms, rather than a less straightforward induction on an auxiliary “height” function as done by Streicher. I’m saying that this independently motivated choice about the interpretation then suggests that it would be better to make choices about the syntax that correspond to it as closely as possible.
Similarly, while having two partial interpretation functions is indeed technically a choice, my claim is that if you look closely at what’s happening in the interpretation (even of a unidirectional theory), the two functions are already there implicitly. I wrote about this back here. I’m saying that since these two “stages” both have to be there in the interpretation anyaway, defining the interpretation will be easier to understand if we separate them into two different judgments on the syntactic side too. Among other things, that should reduce the redundancy in the inductive clauses, since otherwise we would essentially have to inline the definition of the “checking” interpretation for every premise of every rule that in a bidirectional theory would be a checked judgment.
In contrast, having two judgments for equality seems like it would unnecessarily duplicate work in proving the interpretation, since on the semantic side both are interpreted by the same thing.
Finally, I’m not convinced that PER rules would reduce the number of cases, because semantically, defining objects is different from proving that they are equal.
Oh, hang on, I need to think some more about PER judgments. I was assuming that in that case the partial interpretation functions would also have to be PER-ish, but maybe that’s not right: maybe we still have the same three partial interpretation functions I wrote down before, but then there are fewer things to prove by mutual induction in the totality theorem: the statement corresponding to a PER judgment (for instance) would be that the interpretations of and of are both defined and equal on the interpretation of . Maybe that would noticeably reduce the number of cases, in which case it may be worth considering (and therefore also having to parallelly orient or non-orient both the typing and equality judgments). Let me think some more.
Right, unless I’m way off track, the partial interpretation functions don’t look at derivations at all.
Also, based on your edits to the page, I think we’re thinking completely different things about what we’re supposed to be getting out of bidirectionality. It’s not nice because undirected conversion makes typing derivations non-unique. We’re using the partial interpretation functions. The judgments should be understood as mere propositional. Non-unique derivations doesn’t matter at all.
Perhaps that whole first section on bidirectionality should be scrapped and replaced with the real motivation from the semantic/interpretation side as in #12.
Re #15, why do you say the usual rules treat it as an output?
Regarding the fixme, it depends on how we formulate reflexivity, doesn’t it? If the reflexivity rule is
then we don’t need anything to derive it. This rule makes sense if and are presuppositions of the equality judgment , meaning I guess that the equality judgment is treated as checking. And if equality is bidirectional, then it should be the equality checking rule that appears in the mode-switching rule. And this reflexivity rule is also valid semantically, i.e. the partial interpretation of is always equal to itself on any domain.
Conversely, although I can see that it makes sense to regard a judgment that is only derivable by one rule as being “defined” by that rule, but isn’t it still technically an inversion lemma to conclude that any derivation of that judgment must be obtained from that rule?
I guess someone should fix the Bidirectionality section eventually. But I’m more interested to find out the final rules.
For these fully annotated terms, maybe the unidirectional rules can actually be interpreted as going in either direction. (Haven’t checked.) But for the usual annotations for unidirectional intensional type theory, it seems nuts not to consider the type validity an output of the equality rules, like with the typing rules.
Why?
Regarding the fixme, it depends on how we formulate reflexivity, doesn’t it?
Aren’t you thinking of the wrong equality judgment form? I’m pretty sure
doesn’t fly.
Why not? would be an analogous presupposition for it.
But for the usual annotations for unidirectional intensional type theory, it seems nuts not to consider the type validity an output of the equality rules, like with the typing rules.
Why?
Good question. Maybe it isn’t.
Wow, OK. I guess we could try , if we don’t switch to PER-style. Interesting. But we do need to hold on to some way to say a type expression makes sense.
but isn’t it still technically an inversion lemma to conclude that any derivation of that judgment must be obtained from that rule?
No, I personally wouldn’t call it one. You can change it back if you like. In Coq, you’d just destruct
the derivation and get what you want. No lemma.
Right, of course in PER style there would be no general reflexivity rule at all, instead the formation and congruence rules would combine to derive reflexivity for any valid type expression. But in non-PER style, we have the separate judgment , so reflexivity doesn’t need to play that role.
I see the point about inversion, I guess. It’s called an inversion lemma when after destructing the derivation you still have to rule out spurious cases by applying injectivity of constructors?
…But in non-PER style…reflexivity doesn’t need to play that role.
Yes, I understand now. I think. I had just never thought of that before.
It’s called an inversion lemma when…
I don’t know any official rules, but basically yeah. I would expect it to rule out cases based on the form of indexes.
RIght now I can’t decide whether PER-style rules would actually save us work overall.
On the one hand, PER-style rules would eliminate all the congruence rules of equality and their associated inductive clauses in the totality theorem. But this is not actually that big of a savings, because those inductive clauses barely deserve to be called work: each of them is simply the observation that judgmental equality is interpreted by semantic equality, and all the semantic operations respect equality because it’s, well, equality. Maybe the whole initiality theorem is just bookkeeping, but it seems like these rules are really really just bookkeeping. I think (though I’m open to being proven wrong) in a published mathematical paper anyone ought to be content with proving these by “similarly”, and in a formalization they should all be blasted by a simple tactic.
On the other hand, PER-style rules do technically also involve us in additional work. Consider for instance -reduction: . If the part of the totality theorem corresponding to a PER equality judgment is that both sides are defined and equal on the interpretation of , then in the clause associated to -reduction we have to prove, not only that the interpretations of and are equal, but that each of them is defined. In the latter case this is an instance of the substitution lemma, while in the former case it should be just putting together the inductive clauses for and ; but technically, we can’t just say “by the clauses for and ” unless we factor each of those clauses out as a lemma to prove before beginning the entire totality theorem induction. Am I right?
So, it’s not clear to me that PER-style would actually save us noticeable effort, even if it works. And given that we’re not sure whether it works, and that it would look unfamiliar to a substantial amount of the audience, I think I’m back to being inclined against it.
PER-style rules would eliminate all the congruence rules of equality and their associated inductive clauses in the totality theorem.
Just to check: PER-style rules would keep all the congruence rules and get rid of the associated typing rules. That means all the typing rules, and also the type formation rules. Those judgments are gone. (Well actually, they’re now defined as reflexivity instances of equality judgments.)
And this removes clauses from all inductions on derivations. I figured there’d be at least two of those, with totality being the first.
Maybe the whole initiality theorem is just bookkeeping, but it seems like these rules are really really just bookkeeping.
My guess is that all the rules will be mostly bookkeeping. Unfolding the definitions of things. The size of the proof will be very well correlated with the complexity of the formal judgments’ definition. (Number of rules, but also number of term formers in rules.) So maybe “similarly”, or an actual tactic would blow everything away. I dunno. That’s an argument to do whatever’s most intuitive.
Consider for instance -reduction…
We need the substitution lemma anyway, somewhere, for the type of application. I’d be amazed if using it there too isn’t really easy. You’re right of course that we can’t formally say “by the clauses” if they’re inlined into the induction. But to prove that they’re equal, it seems we need to appeal to cases of the partial interpretation anyway.
Now seems like a good time to point out that in my interpreter, the partial interpretation was in two steps: an inductive (!) interpretation relation, then a uniqueness proof. (Well, also an inversion lemma. Pick your poison.) So if you already have the ingredients, proving totality is the one step of applying an interpretation rule. So I don’t think we’d need another set of lemmas to help prove totality of nested term formers in rules.
By the way, many of my totality clauses didn’t need uniqueness of interpretations. That got used for (implicit) direction switches. If we can stick to bidirectional judgments, we can probably be even simpler. But Peter’s idea to use a partial object monad is cool too. So maybe that doesn’t matter. But I have no experience with what direction switches make you do in the later proof steps.
I think I’m back to being inclined against it.
OK.
Yes yes, technically PER-style would eliminate the typing rules and keep the congruence rules, but what I meant was that all the work that was happening on the typing rules would then get folded into the congruence rules, so it’s really the extra work of the congruence rules that would get eliminated.
My guess is that all the rules will be mostly bookkeeping. Unfolding the definitions of things.
Yes, that’s what I meant: all the rules will be mostly bookkeeping, but the congruence rules will be literally nothing but bookkeeping.
It’s possible that one could write a sufficiently smart tactic to blow everything away in a formalization. But if “similarly” were acceptable to deal with most cases in a paper proof, there would be no need for this project. (Some people may think that it should be acceptable, but it’s a fact that it’s not acceptable to everyone, and the point of this project is to get a paper proof written that should be acceptable to all mathematicans.)
…so it’s really the extra work of the congruence rules that would get eliminated.
That still seems like a weird way to think about it. In my estimate, there’s no noticeable difference in triviality between totality cases that actually show totality, and congruence cases that avoid it, once we get the definitions and lemma statements right. But this is silly, we should not waste time arguing about how easy the easy parts are.
But if “similarly” were acceptable to deal with most cases in a paper proof, there would be no need for this project.
You know, I was thinking that. But you brought “similarly” up, and I didn’t want to sound like a wise guy. :)
OK, so I would like to confirm the pending decisions that go into writing the rest of the rules.
You (Mike) want to stick to non-PER rules, and I’m not arguing.
I think we should stick to unidirectional element equality for now. I’m guessing that when it comes to using it, it’s going to feel slightly clunky. Maybe you guys will be able to see a way to do it bidirectionally at that point. I expect that if you can get rules working for the totality proof, they’ll also work for later inductions. So if you think of bidirectional rules that work for totality, I recommend you switch to them.
I think you still want the unidirectional equality judgments to take all validity as inputs. The question is: are you going to try to take advantage of that in the rules, like with that
rule?
When proving totality, as for the induction motive for element equality, it will refer to the term checking interpretation, and assume it’s defined?
I wanted to have my hand at writing down the equality rules but realised that I had overlooked quite some details about the design choices. Let me try to reformulate the part about the flow for conversion rules to see if I understood something.
If I look at the mode-switching rule
then starting from the conclusion I have as hypothesis that : 1) is well-fomed (as an input), 2) is a well fomed type in , 3) and is (syntactically ?) well formed term in .
From the premise , 1) & 2) , I should be able to conclude either an “error” that should be forwarded to the conclusion or that 4) is a well-formed type in .
Now, it means that can actually take 1), 3) & 4) as input and output a “truth” value forwarded to the conclusion.
Does that make sense ? Is that how we intend to write the equality rules ?
However I don’t see how to provide a reasonable interpretation for transitivity if written as follows :
Going bottom-up, we have that , and are well-formed but we need to ensure that is well-formed as well. It would work if we take the more verbose
Another option is of course to have a skewed/arbitrarily orientated flow for conversion : could take as input that is well-formed and is a well-formed type in and output that is a well-formed type in
@kyod thanks for starting to work on this! But I think we need to be careful about the phrase “well-formed”. Are raw terms “well-formed” or do they only become so when a judgment about them is derivable?
I would say the presuppositions of are:
The only presupposition of is that is valid; is a raw term, and such that is an output. Thus, in the mode-switching judgment, (1) of the conclusion yields the presupposition of the first premise, at which point we have both and which should be the presuppositions of the equality judgment .
Thus, I agree with you that in transitivity, should be a premise. I think this is actually necessary semantically as well: the premise ensures that the interpretations of and , which are partial functions, agree when their domains of definition are intersected with each other and with the interpretation of , and similary for . But if for instance the interpretation of has empty domain, this doesn’t ensure that the interpretations of and agree on the intersections of their domains with the interpretation of ; we need the additional assumption that the interpretation of is defined on the whole interpretation of .
@Matt: Yes, I think we should try being as consistent as possible about using the presuppositions, including that premise-free reflexivity rule.
When proving totality, as for the induction motive for element equality, it will refer to the term checking interpretation, and assume it’s defined?
I don’t understand this question, can you explain it further?
Never mind; I have fallen behind in reading your page revisions; the question I should’ve asked is already answered.
I have a different question: Would you also write definitions in the internal extensional type theory of the presheaf category, when this avoids categorical definitions, please? That would make it faster for me to understand what’s going on. It might also be a helpful Rosetta stone for me and others, to get better at translating back and forth.
I think this is actually necessary semantically as well: the premise ensures that the interpretations of and , which are partial functions, agree when their domains of definition are intersected with each other and with the interpretation of , and similary for .
I don’t understand this. From point 4 of the Totality page, it sounds like if ⟦A⟧ is defined on an environment in ⟦Γ⟧, then ⟦B⟧ has to be as well. (And vice versa.) So I don’t see why that premise is needed, for totality. (But it still might be a good idea to have it.)
Re #39, I do think the term equality rules should assume that, in parallel to how the type equality rules assume . I changed them.
Added weakening, contraction, permutation and substitution as admissible rules.
I doubt strenghening holds because of the transitivity case in equality (and it definitely wouldn’t hold in presence of equality reflection).
I think the admissible rules should be stated with a bunch of well-formedness premises. Take weakening, for example. The normal flow of information for the rule (if I understand correctly from the discussion that you’re using a convention of presupposing context well-formedness) would be: the conclusion presupposes G,x:A,G’ ok, so we get to assume that. But then the premise requires G,G’ ok, which isn’t necessarily a consequence of the presupposition of the conclusion. So you have to write the theorem as
If G ok and G |- G’ ok and G |- A type and G,G’ |- J then G,x:A,G’ |- J or maybe If G,x:A ok and G |- G’ ok and G,G’ |- J then G,x:A,G’ |- J (if you’re OK with invoking some inversion on context formation, to conclude G ok from G,x:A ok)
but not
If G,x:A,G’ ok and G,G’ |- J then G,x:A,G’ |- J (because the presupposition of G,G’ ok doesn’t follow from G,x:A,G’ ok, because x:A might be relevant to why G’ is well-formed, even if x doesn’t occur in G – unless you have a strengthening lemma, but I agree with your comment above that that’s either hard or impossible to prove, and should be avoided).
Also, depending on how formal you want to be, technically G’ in these rules is a telescope (a context relative to another context, rather than a context outright), and you need to define or derive telescope formation rules for it (e.g. you could define telescope formation G |- G’ tele as G,G’ ok, but then that has a different presupposition flow from the rest of the judgements…).
Following @dlicata advice, introduced the derived notion of a valid telescope and added a premise to weakening (even though I’m coming from a proof theory/PL background, the bidirectional interpretation is quite non-standard to me).
@atmacen the is actually useful in some extensions (equality reflection…) : you could have and but not being able to derive .
I introduced the notion of telescope but to my understanding the only diffrence with a context is that 1) it is usually open 2) you add elements to contexts on the right and to telescope on the left. I am not sure if in the current context it actually makes sense.
What is the difference between a premise and a presupposition? When we carry out inductive arguments over judgement derivations, presumably we need to verify both premises and presuppositions, so why is it a useful distinction?
(Sorry for asking dumb questions again. I don’t think that I have anything to contribute at this stage, but since Mike expressed desire to make the project accessible to mathematicians, just think of me as a test case of a mathematician who would really like to understand this, but has no experience with nuts and bolts of type theory.)
Thanks, Dan and Kenji. I was worried we needed more assumptions about contexts for the admissible rules. And yes, I was wrong to switch exchange to use only a free variable condition. Sorry about that.
I think we should take Dan’s advice and write the admissible rules in semi-English, since they’re theorem statements.
2) you add elements to contexts on the right and to telescope on the left.
That doesn’t sound right. I think you need to add to the right end of a telescope too. Going to add a judgment.
I’m not sure if it makes sense to treat context formation as a special case of telescope formation: the presupposition of should be that , so then the rule that presupposes itself.
(Also, exchange, contraction and substitution need analogous premises to the ones @kyod added for weakening, if anyone has time to do that.)
Re presuppositions: there’s some discussion in #68 of the discussion of plans but here’s a stab at a simple example. Suppose you’re inductively defining a predicate Foo(n : Nat) by some inference rules (or think of an inductive family / inductive datatype in Coq or Agda if you’ve seen those). Now suppose that for some reason you want n to always be non-zero. You could either
(1) Put sufficient premises on every inference rule such that Foo(n) implies n != 0. This might not mean annotating every inference rule with a premise; for example,
“Foo(1) holds” and “if Foo(n) holds then Foo(2 * n) holds” would have this property, because it’s true in the base case, and inductively if it’s true for n then it’s true for 2n.
This is what I would call adding “premises” to the definition.
(2) Instead define a predicate Foo({n : Nat | n != 0}), i.e. in order to even say Foo(n), you already need to know that n is non-zero. This is what I’d call a presupposition of a judgement.
(1) vs (2) changes “who” checks that n is non-zero. In (1), I think of it as checking that n is non-zero as you construct the derivation of Foo. In (2), I think of it as whoever asks you to construct a derivation of Foo(n) having the obligation to already have checked that n is non-zero.
Dan, for semantics, I think you’re right about needing contexts strictly before telescopes. But for judgments on raw syntax, I’m pretty sure I get away with it. I don’t object to someone changing that, though. I was following Kenji, who had also defined valid contexts as valid telescopes in an empty (raw) context.
The proof of is immediate from the rules, and this then gives . I don’t think there’s a problem with the bootstrapping, syntactically. (But maybe clever syntactic bootstrapping is considered harmful.)
Could we save ourselves duplication of work and avoid dealing with telescopes by proving a general context substitution lemma, covering weakening, exchange, and one-variable substitution all at once?
The link in #54 above should be this (unfortunately there is no nice syntax for linking between nForum discussions, you have to use the full URL).
What is the difference between a premise and a presupposition?
Dan already described the difference in #54, although he omitted the option (3) that’s the one I’m currently proposing to actually use here:
(3) Allow the rules to apply to Foo(n) even when n might be 0, but then just ignore what happens when n=0 and only apply Foo to nonzero numbers. To make this make sense, you have to ensure that if the subject of Foo in the conclusion of a rule is nonzero, then so are all the subjects of Foo in the premises (otherwise talking about Foo(n) for nonzero n might involve you in talking about Foo(0) unexpectedly).
When we carry out inductive arguments over judgement derivations, presumably we need to verify both premises and presuppositions, so why is it a useful distinction?
Actually, we don’t need to verify presuppositions! At least, not as part of the induction. That’s the whole point. The premises are the only actual “inputs” to the inductive constructors, so we only need to include inductive clauses to deal with them. We might afterwards have to verify the presuppositions, but not as part of the induction.
Huh, my use of “presupposition” is different from Dan’s. I consider it a presupposition whenever you only want to worry about Foo(n) when n is nonzero. “Foo(n) presupposes that n is nonzero.” Dan’s (1) and Mike’s (3) correspond to what I’ve been calling validity output and validity input, respectively. Along with Dan’s (2), these are three different ways of implementing what I call “presuppositions”.
To try and explain the difference between Dan’s (2) and Mike’s (3): You are allowed to work with derivations of judgments with (3), even if you don’t already know the validity. You just can’t do anything really useful with them until you know some kind of validity. Dan’s (2) is to make the presupposition part of the type of the judgment. This is probably what you want for semantics, but is perhaps not so useful for syntax.
A subtle thing about formal derivations with validity inputs (that is, about (3)) is that when you do induction on them, you tend to need, not formal validity, but some corresponding “semantic” validity, tailored for the target (“motive”) of the induction. Having formal validity too doesn’t hurt, of course. Intuitively, formal derivations with validity inputs are raw proofs, whose meaning assumes validity, although they themselves don’t.
I keep saying “validity”, because those are the presuppositions for our judgments. But it could be anything you consider a presupposition. (In my sense.)
Regarding telescopes, I don’t think we literally need them, even if we stick to small substitutions. Telescopes let you express weaker premises of the substitution principles, which makes the rules stronger. But you could probably just assume stronger premises using only context validity, and still be able to use the principles when you need to.
Another, weirder option would be to use higher-order admissible rules, with premises like ( —> ). I’ve never tried that, but it seems plausible.
I definitely would not call validity a “presupposition” in (1). The prefix “pre” means “before”, whereas in (1) nothing about nonzeroness of n has to be known or said before the judgment; instead in that case I would say “Foo(n) implies that n is nonzero”.
The reason I use “presupposition” that way is because the only thing that truly corresponds to presupposition is (2). In linguistics, a presupposition is something that must be true for the statement to be meaningful. This seems to clearly correspond to a subset constraint on the type of something in the statement. But until everyone is using type theory with true subset types (not sigmas with truncated second components), it’s also useful to imitate presuppositions with (1) and (3). You have previously referred to (3) as a “precondition”, which I’m basically OK with. But not all preconditions are imitating presuppositions, and true presuppositions are certainly not preconditions.
Since in (3) the intuitive meaning of the judgment is “if is valid, then is a valid type in context ” then maybe (3) should be called something like a “hypothesis”?
I think that would clash pretty badly with actual hypotheses when reasoning about derivations.
More about presuppositions later, but re #56: if you try to define renaming and substitution all at once, the induction doesn’t work (or gets very complicated) because in the process of substitution, you need to do weakening (when you push the substitution under binders). So the usual strategy is to define “total renamings” (variable-for-variable substitution between the whole contexts, which includes weakening, exchange, contraction), and then substitution after that. This paper does things that way, if I recall correctly.
The other option that works is to prove (single-variable) weakening first (easy because it doesn’t change the terms), and then prove single variable substitution for something in the middle of the context. In this style, I think you do formally need telescopes, because to talk about the “middle” of the context, you want to say , where . I don’t see at as a matter of the strength of the assumptions—it’s that to state the theorem I need to talk about a .
In this style, I think you do formally need telescopes, because to talk about the “middle” of the context, you want to say , where . I don’t see at as a matter of the strength of the assumptions—it’s that to state the theorem I need to talk about a .
To clarify the point about assumption strength, I will give an example. The current weakening rule on the page is (with “tele” punctuation added):
My point was that I think you could use:
The assumption seems stronger than needed, but because context validity is a presupposition, we actually expect to know it whenever we need to use the rule. The point would be to avoid ever talking about validity of telescopes. Raw telescopes are raw contexts, and are thus handled already.
I don’t have an opinion about any of this; I think substitutions are always gross.
A ramble about context morphisms vs. small substitutions:
I’m very used to doing small substitutions with de Bruijn indexes. I’ve lost track of how many formalizations I’ve done that way. When I did the formalized interpreter, I wanted to keep doing that, but it turned out that I needed to think in terms of context morphisms for semantic contexts.
So I made a library implementing all the context morphisms corresponding to my favored small substitutions. I was trying to make the semantics work like the syntax. I would say that this part of the development went well. (And it involved semantic telescopes!)
It seems like Mike wants to go the other way, and make the syntax work like the semantics, by implementing context morphisms for raw expressions and derivations. I think you could do this, but the reason we need substitutions in the first place is that the derivation rules use them. Most of them are pretty small. The biggest one is for the variable rule, where you need to weaken a type from the context all the way out of it. (That one was a pain in the butt for me.)
So I don’t know how nicely things go, trying to use context morphisms everywhere. Wouldn’t it be funny if we end up defining small substitutions in terms of context morphisms in terms of small substitutions?
I don’t completely understand what you and Dan are talking about, but I’m confused by #67, because a valid telescope is not a valid context in its own right. The presupposition that does not imply that (or that ), because will probably involve free variables from .
Let me ask this question: is it possible to formulate dependent type theory, including the usual admissible rules it satisfies, without having formal judgments for context or telescope validity? I always thought that it was, and that when such judgments were introduced it was a choice made for a particular reason; but after this discussion I’m not sure any more.
Re #70: No, I’m pretty sure you need at least context validity for the substitution principles.
Maybe I made things all confusing by defining valid telescopes before valid contexts. Usually you don’t do that, and there are basically only two versions of context validity, corresponding to whether or not it’s an input to type validity. (Edit: No; actually whether it’s an output. Not sure if it’s ever neither.)
So the usual strategy is to define “total renamings” (variable-for-variable substitution between the whole contexts, which includes weakening, exchange, contraction), and then substitution after that.
Ah, so a syntactic version of what is written right now (v1) at Initiality Project - Substitution. This is what I’d incline towards doing in the syntax too, then; I’d like to avoid messing around with telescopes if we can.
In category theory, I’ve found that doing things in an unbiased way often makes things simpler, and at least sometimes that seems to have advantages in syntax too (e.g. the comparison between Theorems 2.3.5 and 2.3.8 in my categorical logic notes).
Re #69: The “tele” version works because we receive from the conclusion that , which means that . (At least syntactically. Does this break the semantics?) Combining that with the premise, we get .
Is it possible that with approach (1) you don’t need a context validity judgment? If implies , then we can state weakening for instance as
and the validity of together with should imply the validity of .
Oh, sorry, I misread “” as “ and ” when it should have been . My bad.
That weakening rule looks right. With approach (1), context validity is usually defined mutually with the typing derivations, as we discussed before. But it came up that Barendregt got clever and avoided them completely. I didn’t think that counted.
I’m not sure why you’re worried about context validity all of a sudden.
Well, as I said in #70, I always thought that having a judgment for context validity wasn’t actually necessary, so I want to know whether I’ve been misled all this time! And practically speaking, I think the fewer judgments we have, the simpler the induction should be.
Re 67: I think I understand and mostly agree. The two things I’d say are:
(1) Personally, I always prefer having more rules with clear invariants (even if this means having some “obvious” duplication) to doing subtle things, because (especially in an informal proof) if you have good, consistent invariants then it’s more likely that you won’t screw things up in a way that you don’t notice. I don’t really agree that raw telescopes are raw contexts (unless raw is super-raw and doesn’t even keep track of free variables) — raw contexts don’t have free variables, whereas raw telescopes are relative to a context of free variables. So I think of the as a different kind of thing than , and in that case there should be a formation judgement for that kind of thing.
(2) The difference between the top rule and the bottom rule is that the bottom rule re-checks , whereas with the obvious inductive definition of you don’t, which matches the convention we’re using elsewhere.
Re #77 and previous: My understanding is that in (1), a rule like
needs a context formation premise, so it has to be mutual with the other judgements. Or, if you want to push it further and say that should imply , then
needs it. How do you get away without this?
BTW, just as a general comment, I definitely think the Peter strategy (I think Guillaume was also doing things this way when he talked about initiality in Pittsburgh), and formulating the type theory to match it, is the right thing to try first. Proofs of syntactic properties are generally easier if you follow the natural flow of information of things through the rules.
Re: #79 That makes sense; I guess I’ve been wrong all this time. But it seems like (3) avoids a context validity judgment in the primitive rules, although we need to define it and incorporate it before getting to the admissible ones?
Re #80, just to be clear, by this you mean (2) or (3)? Do you still prefer (2)?
Re #59: Terminology aside, I mostly agree with this, especially the point about (3) needing semantic validity assumptions for most theorems (note sure about the “is perhaps not so useful for syntax” :-) ). Terminology-wise, I would say “invariant” for any of (1) (2) (3) (“it’s an invariant of the judgement that when G |- a : A, G ok”), and I liked “postcondition” for (1); and maybe “intrinsic presupposition” for (2), since it’s part of the (meta)-type; and “extrinsic presupposition” for (3), because it’s external. I think “presupposition” has been used in confusing ways in the type theory literature, which maybe include (1).
But I’m not sure whether we’re all on completely the same page about (3) at a technical level. For concreteness, let’s say that Foo(n) is supposed to mean that n <= 7, but have the presupposition that n is nonzero. So the “inference rules” (assuming a <= function valued in bool) are
data Foo(n: Nat) where foo : (n : Nat) -> ((n <= 7) == true) -> Foo n
So Foo(0) is true, but for whatever reason we want it to be illegal. (I’m thinking of this as a proxy for when is ill-formed).
Then defining and using that as the assumption of theorems involving Foo seems like the right way to work with (3) in the style of (2) (the motivation being avoiding induction-induction):
If RealFoo(n) then blah
i.e.
If and then blah
because then you assume the presupposition before assuming .
On the other hand, in 71 in the other thread, I think Mike’s suggestion is to work with (unless we were inconsistent about which one is the primed turnstile, in which case it’s , but these are equivalent). But doesn’t seem like a good assumption for a theorem, because it’s trivially true whenever the presupposition is violated (in this case n = 0), even if weren’t true, so the theorem would have to handle that case.
Maybe the difference is that you were thinking of as a conclusion of a theorem? in which case I agree that a typical theorem would likely assume , or something else that implies it.
I still need to think some more about the proposed local totality theorems though.
I agree with your definition of . What I meant was that although in fact is the inductive property and , we think of as the property we care about and think of as meaning . I’m not suggesting we actually work with , it’s just how we think about the meaning of . (Of course it’s not exactly the same as , because isn’t necessarily true, but the point is that we should only consider ourselves allowed to conclude something from if we also know that .)
I think of the local totality theorems as matching this intuition.
In case it is helpful, let me give an analogy to the local totality theorems using Foo. Suppose we want to define a function from natural numbers to, say, lists of integers, and prove that whenever holds then holds. With the definition , how can we do that? If we think of as meaning , so that is like , that suggests we find some other property in the world of lists corresponding to as a “stage” on the way to . For instance, we could prove that (a) whenever then has even length, and (b) whenever , if has even length then . Then putting these together, we get that if then , hence .
Here is syntactic context validity, is the (3)-style judgments, is the partial interpretation, is semantic context validity, and is definedness of the partial interpretation. So the local totality theorem has the form “given a derivable syntactic judgment, insofar as the semantic context is valid, the interpretation of the term is defined.” It’s not a perfect analogy by any means, but I think it conveys the important intuition I have.
Re #79 In my understanding of the style we are currently setting up the rules (the (3)-style ?), the rules for unit would be
and
Locally to the rule there is no assumption that is valid, in the sense that we can derive this for an arbitrary context (not even syntactically well scoped).
However, given a valid context (that is a raw context with pairwise distinct variables and a list of derivation – in particular is well-scoped – that I abbreviate ), we can now derive both the pair and the triple which are the actual objects of interest (In the sense that there should be a bijective correspondence between these pairs/triples and the derivation obtained by the (1)-style).
TBH, it is the first time I’m seeing such a presentation, and I’m not sure whether it will go through, but it has 2 really nice aspect to it :
Re #53, #67 I have now doubts about the necessity of the premise in weakening : If I am given a derivation of , I can build by structural induction on (the induction happening mutually for the 5 kinds of judgments) a new derivation of for and any raw term , even if is valid but is not and vice-versa. The reason is that we don’t use expect the context to be valid to derive a judgement (even though at the end of the day, we are only interested in the derivables judgements whose context is valid).
So the rule
should be admissible (where is distinct from any variable in so that is indeed a raw context). However if we want a corresponding weakening rule on pairs for instance, we would need the more familiar
I’m quite puzzled by this (in particular the fact that there is no relation between being a valid context and being valid in the weakening rule), and wondering whether I made a wrong step somewhere. Does that seems right/wrong ?
I think #79 was about style (1). I agree (#81) that style (3) does seem to avoid the need for a mutually defined context validity judgment, breaking this particular mutually recursive loop (although there are other such loops: the type judgment, term judgment, and equality judgments still have to be mutually defined).
Re #87 Right, there are other loops, and here is the current situation (with only -types)
Of course, adding universes introduce other (necessary I think) dependencies :
So I guess, as long as we don’t have any construction binding/representing arbitrary context, we do not need to mutually define the context validity judgement.
Note that markdown automatically numbers all numbered lists starting from 1; so perhaps your references may not be displaying the way you intended?
I also don’t quite know what you mean by “depends on”.
Oh, does “depends on” mean “occurs as the conclusion of rules whose premises contain instances of”? That makes sense if I increment all your number references.
The additional dependencies that you mention “from universes” actually I think happen whenever you have any way to produce nontrivially dependent types, which includes universes but also for instance identity types.
@Mike Thanks, yes, the list was supposed to start at 0 (I edited it) and “depends on” means is used as a premise. And indeed any type former actually depending on terms will introduce the last two dependencies.
Shucks, Kenji’s absolutely right that our admissible structural rules don’t need context or telescope validity! I just checked my code. Sorry, Mike, I remembered wrong; you were right!
Here are my admissible rules:
This is de Bruijn indexes. The shifts up variables not under by .
Edit: I abuse the shift and substitute notation for contexts too. For contexts, it’s as if were .
I don’t think I was right about this – I’m as surprised as you are! I guess the point is that although we need to state the primitive rules in such a way as to maintain the “presupposition invariant”, whereby the validity of the conclusion context can be used to deduce (perhaps in order) validity of the premise contexts, we don’t need to force ourselves to state the admissible rules in a similar way?
I don’t think I was right about this – I’m as surprised as you are!
Huh?
Anyway, the sanity check still needs context validity. So maybe you weren’t entirely right? But my correction was incorrect, since I was thinking the structural rules needed it.
we don’t need to force ourselves to state the admissible rules in a similar way?
Well admissible rules are theorems. You can state them however you like if you can prove them. It’s evidently quite subtle what you need, if you don’t think about the actual proof. For the structural rules, you don’t need context validity, for the roughly same reason raw structural operations don’t need contexts, plus some luck. But you definitely don’t always get this lucky.
Re #78:
I don’t really agree that raw telescopes are raw contexts (unless raw is super-raw and doesn’t even keep track of free variables) — raw contexts don’t have free variables, whereas raw telescopes are relative to a context of free variables.
I see. But yes, currently, our raw syntax really is that raw.
There’s an oddity to it though, which is that raw contexts are constrained to have pairwise-distinct variable names. I don’t see any reason for that; the raw terms and types don’t track free variables in their (meta)type, so the development of raw syntax doesn’t itself rely on the constraint. And the context validity judgment enforces it redundantly.
I like raw syntax. The typing rules cook it all the way through. Anything less is undercooked. If it needs more cooking, it might as well be raw.
The difference between the top rule and the bottom rule is that the bottom rule re-checks , whereas with the obvious inductive definition of you don’t, which matches the convention we’re using elsewhere.
Right. And that’s what I meant that is stronger than needed. But think of programming. You have a subroutine that wants to use something you should already have, so you just pass it along. No need to take it apart on one end and put it back together on the other just because you can.
Also, if we followed your approach of assuming the presupposition along with the judgment. We’d simply assume .
But it turns out you don’t need either. There’s no cleverness in avoiding it; you just don’t need it.
Re #83:
I like “invariant” as a general term for (1), (2), or (3).
I don’t think what you’re calling “extrinsic presupposition” is the same as (3), which I think should be called “precondition” for symmetry with (1) being called “postcondition”. With (3), you don’t do anything to try to bundle things up with their invariant(s). In your Foo(n) example, you’d just use Foo. You’d also never use . I think “intrinsic presuppositions” and “extrinsic presuppositions” are both variants of (2). You had me convinced already though that (2) and (3) can be very similar.
As far as extrinsic presuppositions being a way to avoid induction-induction, first of all Foo is a bad example for this, because the intrinsic presupposition version of Foo still wouldn’t be inductive-inductive. Second, when you’re actually inductive-inductive, getting the induction-induction principle from extrinsic presupposition style is quite involved. Without UIP, I think it’s unsolved.
I think a unit type would go like this:
The typing rule should be synthesizing, and I think Dan’s “” was a TeXnical difficulty.
It looks like the exchange rule doesn’t need . Here’s a proof, using precondition-free weakening and substitution, and an “obvious” alpha renaming, which is actually the tricky part. Do not trust this!
(1 = hypothesis)
(where is fresh)(2 = weaken 1)
(3 = variable rule)
(4 = substitute 3 into 2)
(5 = alpha rename 4)
(same as 5, since is fresh)
The reason why the alpha rename doesn’t affect is because:
So the problem is that the context could be a bunch of nonsense, and could actually be just , why not? So I think the correct additional premise is , which ought to imply that the free variables of are entries of . And we know that isn’t an entry in because is a raw context, and it’d have duplicate entries if had . So maybe that constraint was a good idea?
But you see (drumroll): we could also just have as the premise! Who’d’ve thought it?
I originally had allowed raw contexts to contain duplicate variables. But then I wasn’t sure what to do about duplicate variables in defining the partial interpretation of contexts. I suppose we could just declare that if there are duplicate variables then the interpretation is empty. But that really makes it feel stupid to allow duplicate variables in the first place, if we’re throwing away that possibility “by hand” as soon as we start doing anything. And actually I think the same thing is happening with ill-scoped contexts only less explicitly: as soon as there is a type in the context containing a free variable (that doesn’t occur to the left of it), then the partial interpretation of that type has empty domain, hence the partial interpretation of any context containing it is empty.
This inclines me towards Dan’s point of view that a raw context should at least be well-scoped. This is a simple property that can be defined on raw syntax without any inductive-inductive problems, and since my motivation for using raw things is “pragmatic” rather than philosophical I don’t have a problem imposing it.
Should the substitution principle be stated with or ? It seems like to me, because that’s the “most general” judgement (and the equivalent of in the non-bidirectional system). If requires a definitional equality type coercion to have type , I should still be able to substitute it for a variabel of type .