Welcome to nForum
    • CommentRowNumber1.
    • CommentAuthormaxsnew
    • CommentTimeSep 28th 2017

    I’ve started writing duploid, which generalizes thunk-force category to include negative types as well as positive ones.

    The missing concept of category with runnable monad, a version of thunk-force category for negative types, was introduced in the same thesis as duploids, but without many explicit definitions, since it is immediately subsumed by duploids.

    More to come, especially the relationship to adjunctions.

    • CommentRowNumber2.
    • CommentAuthorMike Shulman
    • CommentTimeSep 29th 2017

    Interesting. I think you’re a bit inconsistent with the order of composition; did you mean for it to be diagrammatic or functional? Also should the third associativity law be $(f\circ g) \bullet h$ on the right?

    This looks to me almost like an instance of a category enriched in a bicategory, but the failure of general associativity is weird. I don’t quite understand what it means for a composition $f \odot g$ to “evaluate $f$ first” or “evaluate $g$ first”; surely in computing $f(g(x))$ we have to compute $g(x)$ before we can plug it into $f$?

    • CommentRowNumber3.
    • CommentAuthormaxsnew
    • CommentTimeSep 29th 2017

    Thanks for finding those errors Mike, I’ve fixed them. Also I noticed that I flipped the convention from the thesis for $\bullet, \circ$, so I swapped them back, and changed the names of the morphisms to be consistent too.

    surely in computing $f(g(x))$ we have to compute $g(x)$ before we can plug it into $f$?
    

    This is exactly the difference between call-by-value and call-by-name. In call-by-value, $g(x)$ is evaluated to a value $v$ and then we evaluate $f(v)$, but in call-by-name the entire expression $g(x)$ is substituted into $f$.

    Most programming languages use call-by-value, in which case you would be right, but Haskell, for example, uses call-by-need (essentially an optimization of call-by-name for an idempotent effect). In Haskell, f is evaluated first, as you can see if you run the following program (ignore the :: Int parts):

    let f x = error "f evaluated first"; g x = error "g evaluated first" in (f (g 3 :: Int) :: Int)
    

    error essentially crashes the program with an error message, and if you run this at https://www.tryhaskell.org/ you can see it will print “f evaluated first”, whereas a similar program in most languages (Python, Java, OCaml, Scheme/Racket) would print “g evaluated first”. The way this reduces is that the expression g 3 is substituted into the body of f, which doesn’t use it, so the error g tried to raise is lost.
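    If you’d rather run it locally than on tryhaskell, here is a self-contained sketch of the same experiment (my wrapper, not from the thread), using try/evaluate from Control.Exception to catch the crash and report which error actually fired:

```haskell
import Control.Exception (ErrorCall, evaluate, try)

f :: Int -> Int
f _ = error "f evaluated first"

g :: Int -> Int
g _ = error "g evaluated first"

main :: IO ()
main = do
  -- Force f (g 3) to weak head normal form and catch the resulting error.
  r <- try (evaluate (f (g 3))) :: IO (Either ErrorCall Int)
  case r of
    -- show on an ErrorCall begins with the message (a call stack may follow),
    -- so keep only the first line.
    Left err -> putStrLn ("crashed with: " ++ takeWhile (/= '\n') (show err))
    Right v  -> print v
```

    Under lazy evaluation this reports f’s error: the thunk `g 3` is never forced, because f ignores its argument.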

    Here’s a similar program that uses let instead of functions since that’s a more direct analogue to duploid composition:

    let x = error "Strict evaluation!" in (error "lazy evaluation" :: Int)
    

    You can run that to see what kind of language you are working in.
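    For reference, here is the lazy-language outcome as a complete runnable Haskell file (my sketch; the 42 is an arbitrary stand-in result):

```haskell
main :: IO ()
main = print (let _x = error "Strict evaluation!" :: Int in (42 :: Int))
-- In a lazy language this prints 42: the binding for _x is never forced.
-- A strict (call-by-value) language would instead crash with
-- "Strict evaluation!" before reaching the body of the let.
```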

    I think the failure of associativity will make more sense once you see how duploids arise from adjunctions, which I’ll put up soon.

    • CommentRowNumber4.
    • CommentAuthorMike Shulman
    • CommentTimeSep 29th 2017

    Okay, but then why do we have $f\bullet (g\circ h) = (f\bullet g) \circ h$? Shouldn’t the LHS evaluate $g$ then $h$ then $f$, while the RHS evaluates $g$ then $f$ then $h$?

    • CommentRowNumber5.
    • CommentAuthorMike Shulman
    • CommentTimeSep 29th 2017

    I’m looking at the construction from adjunctions, and it looks weird too. I’m familiar with the graph of a profunctor but this seems to be something different; why are morphisms in $\mathcal{D}$ between positive types defined by applying $F$ to them rather than just using the morphisms in $C_+$?

    • CommentRowNumber6.
    • CommentAuthormaxsnew
    • CommentTimeSep 29th 2017
    • (edited Sep 29th 2017)

    Great questions.

    Shouldn't the LHS evaluate $g$ then $h$ then $f$ while the RHS evaluates $g$ then $f$ then $h$?
    

    That’s not quite how it works. It might be a better intuition to say that in the call-by-name composition $g \circ f$, $g$ has control, whereas in the call-by-value composition $g \bullet f$, $f$ has control. You can think of it as: $g \circ f$ means that $f(x)$ is passed as an unevaluated thunk to $g$, whereas in $g \bullet f$, $g$ is passed as an “unevaluated continuation” to $f$ (continuation-passing style makes this literally true). So in $f \bullet (g \circ h)$ and $(f \bullet g) \circ h$, $g$ is in control: it has a thunk $h(x)$ that it can force (or not), but it also has a continuation $f$ it may run (or not).

    Again as concrete programs, if $f,g,h$ are all one-argument functions in a call-by-value language, we can write $g \circ f$ as $\lambda x. g (\lambda y. f(x))$, where $y$ does not appear in $f$, and $g \bullet f$ as just $\lambda x. g(f(x))$. Then $f \bullet (g \circ h)$ looks like:

    $\lambda x. f (g (\lambda y. h(x)))$

    and $(f \bullet g) \circ h$ looks like:

    $\lambda x. (\lambda y. f(g(y))) (\lambda z. h(x))$

    which is the same after doing a $\beta$ reduction.

    I'm looking at the construction from adjunctions, and it looks weird too.  I'm familiar with the [[graph of a profunctor]] but this seems to be something different; why are morphisms in $\mathcal{D}$ between positive types defined by applying $F$ to them rather than just using the morphisms in $C_+$?
    

    Right, it’s similar to the cograph of a profunctor, but different. The thing to remember is that, like in a thunk-force category, we never get direct access to the morphisms of the original categories; we only get them in the view of the “Kleisli category”/“co-Kleisli category”. Then after the fact we can recover them (to some extent) as the linear and thunkable morphisms. The intuition (for me) comes from effectful programming languages, where you don’t get a direct syntax for “pure” or “linear” arrows, only the effectful arrows. So a morphism between positive types isn’t a “pure” arrow from the original category $C_+$, but instead an “effectful arrow” from the Kleisli category: $O(P,F P) = C_-(F P, F P) = C_+(P, GF P)$.

    • CommentRowNumber7.
    • CommentAuthorMike Shulman
    • CommentTimeSep 29th 2017

    if $f,g,h$ are all one-argument functions in a call-by-value language we can write $g \circ f$ as $\lambda x. g (\lambda y. f(x))$ where $y$ does not appear in $f$

    What are the types of these things? If $f:A\to B$ and $g:B\to C$, then it seems that $\lambda y.f(x)$ should have a type like $Y\to B$ (where $y:Y$), which is different from $B$ and hence is not the sort of thing that $g$ can be applied to.

    • CommentRowNumber8.
    • CommentAuthorMike Shulman
    • CommentTimeSep 29th 2017

    In general I’m having trouble figuring out exactly how to think about a morphism $f:A\to B$. It feels like you’re saying that if $B$ is positive, then a morphism $f:A\to B$ is a function that eats an $A$-value and spits out a $B$-value, whereas if $B$ is negative then a morphism $f:A\to B$ is a function that eats both an $A$-value and a $B$-eating continuation as arguments. Is that right? Does the intuition depend also on whether $A$ is positive or negative?

    Is there a real-world programming language you can point to that has this polarization of types? Or just a completely specified $\lambda$-calculus-like syntax?

    I’m used to thinking of polarity as a property of how a type is constructed, e.g. sum types are positive while function types are negative. But it seems that duploids are using polarity as a sort of marker placed on a type, like a “strict” modifier in a type signature, rather than a property of types that already exist. How are these notions of polarity related?

    • CommentRowNumber9.
    • CommentAuthormaxsnew
    • CommentTimeSep 29th 2017
    • (edited Sep 29th 2017)

    I would say the intuition is that in the most general situation (i.e., the most effects) a positive input is given as a value, but a negative input is given as a thunk. A negative continuation is a linear evaluation context but a positive continuation is given as an explicit continuation.

    Is there a real-world programming language you can point to that has this polarization of types? Or just a completely specified λ-calculus-like syntax?
    

    Yes, I probably should have brought this up earlier. Duploids are based on explicit term calculi that people have made for the sequent calculus, analogous to how $\lambda$-calculus models natural deduction. These usually go by the name “System L”, after Gentzen naming his sequent calculi LK and LJ, though older work calls them System $\bar\lambda\mu\tilde\mu$. They were introduced in this paper, and types did not have an explicit polarity. System L is also presented in Guillaume’s thesis where he defined duploids (linked in the article).

    In particular, in polarized L, what I said above about positive inputs being passed as values is literally true.

    How are these notions of polarity related?
    

    This is something that I don’t quite understand well enough to say very definitively; I’m looking at duploids in part to understand all of this better. Guillaume’s thesis and this expository article about L-like calculi talk about what polarity means in the context of System L.

    My rough understanding is that since values of negative types are uninspectable thunks, it doesn’t really make sense for them to have a syntactic mapping-out universal property; instead you have to express it with their continuations. And dually, for positive types, their continuations are syntactically uninspectable, so you can only express a mapping-out universal property by the shape of their values.

    • CommentRowNumber10.
    • CommentAuthorMike Shulman
    • CommentTimeSep 30th 2017

    Note: you can quote previous comments without making them verbatim by putting a > at the beginning of the line instead of four spaces.

    • CommentRowNumber11.
    • CommentAuthorMike Shulman
    • CommentTimeSep 30th 2017

    If I am writing a function that takes an input of type $A$, what difference is there in the kind of code I’m allowed to write based on whether my input is given as a value or as a thunk? In lazy Haskell I write the same code as I would in a call-by-value language, it’s just that the argument doesn’t get evaluated until necessary. Is it that a thunk is something I can also choose to force even if I don’t need its value (whereas if I do need it then I have to force it)? Or that I can choose to force it more than once?

    I’m excited to hear there is a connection with sequent calculus! I don’t think I’ve heard before anything about a systematic study of term calculi for sequent calculus. Maybe if I have some time I’ll try to read some of that.

    • CommentRowNumber12.
    • CommentAuthorRichard Williamson
    • CommentTimeSep 30th 2017
    • (edited Sep 30th 2017)

    Fortran would be an example of a language that can typically be thought of as call-by-reference in a more direct sense (i.e., I’m not sure that laziness should be taken as a prototypical example of call-by-reference). One certainly would typically code a bit differently, e.g. it is much more natural to make use of mutability. There is also of course a big difference in how things work when one goes down to the nuts and bolts (i.e., in the compiled code, the machine instructions); it is easier to write optimised code in a more natural way in call-by-reference, I would say (e.g. one does not typically need pointers, as one would in C).

    • CommentRowNumber13.
    • CommentAuthorMike Shulman
    • CommentTimeSep 30th 2017

    I think call by reference is different from call by name.

    • CommentRowNumber14.

    Ah, sorry, was not reading closely enough!

    • CommentRowNumber15.
    • CommentAuthormaxsnew
    • CommentTimeSep 30th 2017
    • (edited Sep 30th 2017)

    If I am writing a function that takes an input of type $A$, what difference is there in the kind of code I’m allowed to write based on whether my input is given as a value or as a thunk? In lazy Haskell I write the same code as I would in a call-by-value language, it’s just that the argument doesn’t get evaluated until necessary. Is it that a thunk is something I can also choose to force even if I don’t need its value (whereas if I do need it then I have to force it)? Or that I can choose to force it more than once?

    Typically the code you write looks the same but has different behavior. In “true” call-by-name, a thunk can be evaluated more than once, so a program like:

    let x = print "0"; 3 in x + x
    

    would print “00” and return 6, whereas in call-by-value it would print just “0” and return 6. Haskell, on the other hand, is call-by-need: the first time the thunk is evaluated, the result is stored and returned if the thunk is forced later. Traditional call-by-name $\lambda$-calculi would not have an explicit way to force something without using the value, but Haskell does support this (via seq) because it is very useful for managing memory usage.
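    To make the sharing visible, here is a small (admittedly hacky) sketch of mine that counts how many times a thunk is actually evaluated, using unsafePerformIO; it assumes GHC’s default unoptimized evaluation, where a let-bound thunk is shared:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global evaluation counter (standard unsafePerformIO idiom; NOINLINE
-- prevents GHC from duplicating the IORef).
evalCount :: IORef Int
evalCount = unsafePerformIO (newIORef 0)
{-# NOINLINE evalCount #-}

main :: IO ()
main = do
  -- x is a thunk whose evaluation bumps the counter as a side effect.
  let x = unsafePerformIO (modifyIORef' evalCount (+ 1) >> pure (3 :: Int))
  print (x + x)                 -- 6
  readIORef evalCount >>= print -- under call-by-need this is 1: the thunk
                                -- was shared; true call-by-name would have
                                -- evaluated it twice
```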

    Warning: minor rant incoming.

    It’s probably worth pointing out that there hasn’t been a “real” programming language to use true call-by-name since maybe Algol 60 (as in 1960), since it’s easier to simulate call-by-name using thunks than the other way around, and Haskell uses call-by-need and avoids any real effects besides divergence. However, call-by-name languages survived a lot longer in the denotational-semantics world, because there was a perception that they had “better” semantics: they have the $\beta$ and $\eta$ laws of the pure $\lambda$-calculus, whereas call-by-value only has $\eta$ for values (this also helped the perception of denotational semantics as not applicable to “real” programming languages). In categorical terms this is because the models of call-by-name are cartesian closed: cartesian closure only requires mapping-in universal properties, which are inherited by the Kleisli category of a comonad. On the other hand, these categories don’t typically have sums, whereas call-by-value calculi do, because mapping-out universal properties are inherited by the Kleisli category of a monad. Calculi like call-by-push-value and System L are an attempt to get the best of both worlds.

    • CommentRowNumber16.
    • CommentAuthorMike Shulman
    • CommentTimeOct 2nd 2017

    a program like: let x = print "0"; 3 in x + x would print “00” and return 6

    Can I ask a more naive question: why would one ever want that behavior? It seems highly counterintuitive to me, for instance, for the two expressions x + x and 2 * x to have such different behavior. In a language with side effects, I would want to control precisely what code gets evaluated and what doesn’t, and how many times, rather than leaving it up to the details of how the values returned by that code get used. Certainly there are situations in which one wants to conditionally evaluate an argument, but one can always deal with that by explicitly passing a function (thunk?) rather than a value.

    Perhaps a related question is why, when combining call-by-name and call-by-value, you would ever think of dividing it up by considering some types to be called by name and others by value? It seems to me that call-by-name versus call-by-value is a consideration orthogonal to the type of the value in question, so I would expect rather to have a marker on each argument of a function, separate from its type, indicating whether it is passed by value or by name.

    • CommentRowNumber17.
    • CommentAuthormaxsnew
    • CommentTimeOct 2nd 2017

    Sorry if my comments are running a little long here.

    Can I ask a more naive question: why would one ever want that behavior? … In a language with side-effects, I would want to control precisely what code gets evaluated and what doesn’t, and how many times, rather than leaving it up to the details of how the values returned by that code gets used.

    I agree with you for a language that has IO effects; I’ve never programmed in such a call-by-name language, but it seems kind of insane. However, consider a language like Haskell, where we can idealize slightly and say the only effects are program-ending: a computation diverging, and error "blah", which crashes the program. There the difference between an effect performed once or twice is immaterial, because the program essentially crashes the first time.

    The advantage you get is that it makes more programs equal. Specifically, it means you can freely substitute an expression for a variable and know that the program does the same thing. In a call-by-value language you have to know a lot of context (that the program is pure, i.e., in semantic terminology thunkable) to know that this transformation is allowed. You can also freely re-order a sequence of let-bindings, which again takes a good deal of contextual knowledge to validate in call-by-value.
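    A toy demonstration of the re-ordering point (my example, not from the thread): in Haskell the order of a sequence of crashing let-bindings is unobservable, whereas a call-by-value language would crash at whichever binding comes first:

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- The same two bindings in both orders; only b is used.
progAB, progBA :: Int
progAB = let a = error "A" :: Int
             b = error "B" :: Int
         in b
progBA = let b = error "B" :: Int
             a = error "A" :: Int
         in b

main :: IO ()
main = do
  ra <- try (evaluate progAB) :: IO (Either ErrorCall Int)
  rb <- try (evaluate progBA) :: IO (Either ErrorCall Int)
  mapM_ report [ra, rb]
  where
    -- show on an ErrorCall begins with the message; keep only the first line.
    report (Left err) = putStrLn ("crashed with: " ++ takeWhile (/= '\n') (show err))
    report (Right v)  = print v
```

    Both versions crash with “B”: the binding for a is never forced, so re-ordering (or deleting) it is a valid transformation. Under call-by-value, progAB would crash with “A” instead.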

    These kinds of transformations are probably even more useful to compilers, which try very hard to move code around and are much freer to do so in a lazy language like Haskell. Cbv compilers need to do static analysis to see if something is thunkable, which is quite expensive, and so most compilers will only do it for very obvious special cases (like constants). There’s a whole area of research on “fusion” about discovering these kinds of program transformations, where multiple passes over a data structure are fused into a single pass; these are almost never valid transformations for cbv languages.

    Perhaps a related question is why, when combining call by name and call by value, you would ever think of dividing it up by considering some types to be called by name and others by value? It seems to me that calling by name and calling by value is an orthogonal consideration to the type of the value in question, so I would expect rather to have a marker on each argument of a function, separate from its type, indicating whether it is passed by value or by name.

    First, I would say that the shifts are exactly a marker you can put on a type to switch it from data to codata or vice versa. Also, I don’t think it’s orthogonal: you can only get the right $\eta$ rule for positive types/data if they are treated in a call-by-value manner, and you only get the right $\eta$ rule for negative types/codata if they are treated in a call-by-name manner. Think of it as a modal distinction: some types naturally come from the “negative” category, some naturally come from the “positive” category, and there are modalities (the shifts) that let you change categories.

    Also it should be said that these calculi are not necessarily being promoted as surface languages, but rather as better for automated tools like proof search/program synthesis and as an internal representation for compilers. In my experience they are quite hard to read, just like sequent calculus proofs are harder to read than natural deduction proofs.

    • CommentRowNumber18.
    • CommentAuthorMike Shulman
    • CommentTimeOct 2nd 2017

    Sure, I believe in Haskell. But in #15 you were specifically distinguishing between Haskell and “true” call-by-name, so I was wondering why one would want the latter.

    • CommentRowNumber19.
    • CommentAuthorMike Shulman
    • CommentTimeOct 2nd 2017

    Did you ever answer #7? If so I missed it.

    • CommentRowNumber20.
    • CommentAuthorMike Shulman
    • CommentTimeOct 2nd 2017

    I’m still resistant to the idea that associativity can fail. Intuitively, associativity in a category is a representation of the most basic property of time: if you do X and then Y and then Z, it doesn’t matter how you parenthesize it. So if I see some category-like structure that fails to be associative, it seems to me that it must be encoded as a category in the wrong way.

    • CommentRowNumber21.
    • CommentAuthorMike Shulman
    • CommentTimeOct 2nd 2017

    it seems to me that it must be encoded as a category in the wrong way.

    Or less combatively (-: that there should at least be some other way of encoding the same data as a category-like structure that is associative.

    In this case, it feels to me like if in $g\bullet f$ it is $f$ that “has control”, then perhaps we should think of $g$ as being passed to $f$ rather than vice versa. Then $f\bullet (g\circ h)$ and $(f\bullet g)\circ h$ would both be some kind of multicategorical composition $g \circ (f,h)$, whereas the unequal $f\circ (g\bullet h)$ and $(f\circ g)\bullet h$ would be something like $f \circ h\circ g$ and $h \circ f\circ g$ respectively (though with some multicategorical annotation that I don’t understand yet to indicate how things are plugged together).

    • CommentRowNumber22.
    • CommentAuthormaxsnew
    • CommentTimeOct 2nd 2017
    • (edited Oct 2nd 2017)

    Did you ever answer #7? If so I missed it.

    Oh, I think I missed that one; your comment is right. In order to express call-by-name evaluation in a call-by-value language we have to use thunks, so if we want to have a call-by-name composite $g \circ f$, we need to change the type in the middle to be thunked: $f : A \to (1 \to B)$ and $g : (1 \to B) \to C$, and change all the places where $g$ uses its input from $x$ to $x(*)$.
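    A quick Haskell rendering of that typed fix (names and the concrete numbers are mine; `() -> b` plays the role of the thunk type $1 \to B$):

```haskell
-- The thunked middle type: 1 -> B becomes () -> b.
type Thunk b = () -> b

-- f : A -> (1 -> B): wrap the (unevaluated) body in a thunk.
f :: Int -> Thunk Int
f x = \() -> x + 1

-- g : (1 -> B) -> C: every use of the input x becomes x(*), i.e. tx ().
g :: Thunk Int -> Int
g tx = tx () * 2

-- The call-by-name composite g ∘ f is now just ordinary composition.
main :: IO ()
main = print (g (f 3))  -- (3 + 1) * 2 = 8
```

    Of course Haskell is already lazy, so this only illustrates the typing that answers #7; in a genuinely call-by-value language the explicit thunk is what delays evaluation.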

    In this case, it feels to me like if in gfg\bullet f it is ff that “has control”, then perhaps we should think of gg as being passed to ff rather than vice versa

    Continuation semantics basically works like this, but I don’t know of anything (besides duploids) that is a more “direct” style model, like some sort of multicategory.