I have created random variable with some minimum context.
In addition I have added pointers to Kolmogorov’s original book and to some modern lecture notes on probability theory, and to some related entries.
I have briefly cross-linked probability space with possible worlds, indicating a similarity of concepts and an overlap of implementations.
I thought a random variable was defined as a measurable function considered modulo almost everywhere equality.
That makes sense. The references that I listed don’t put it that way, though, but possibly others do. Which? Let’s add them, too. The entry clearly is still stubby.
About $[W,-]$ here:
This view isn’t compatible with $[W,X]$ giving the realizations of a Quantity in a space $X$, is it? The terms of $[W,X]$, if they are random variables, are fairly independent thingies, a bunch of things some of which may interest you and others don’t.
Also, where is the measure fixed?
Nikolaj, it is a bit hard to tell what you are trying to ask. In parts it sounds like you are trying to grasp the standard concept of random variables as in usage since the 1930s, in parts it sounds like you are wondering to what extent spaces of random variables are captured by the function monad.
On the first point I’d suggest to pick up any lecture notes on modern probability theory; I have added some pointers, but there are arbitrarily many more.
On the second point, I am aware that the very simplest phenomena in the universe may sometimes be as puzzling as the very complicated ones, but with a little concentration it’s not that hard:
measurable functions are the morphisms in the category of measurable spaces, and random variables are measurable functions on the fixed configuration space $W$ with values in any other measurable space $S$ (see Dembo 12, def. 1.2.1). Hence it is just a fact, going back to a neat insight of Kolmogorov’s in the 1930s, that a space of random variables is just a hom space $[W,S]$ in the category of measurable spaces. We may go to the sheaf topos over the site of measurable spaces to make it a genuine internal hom, if desired.
That’s it. Now just saying “function monad” of $W$ of course does not yet equip $W$ with a measure, if you want that you need to say “let $W$ have a (probability-)measure”.
More generally, what one really needs for a concept of probability and random variables is just any kind of space equipped with a concept of normalized integration. We could for instance go inside a cohesive $\infty$-topos that has an interval object $I$ exhibiting the shape modality. Then for abelian (stable) coefficients $S$ there is an abstract integration operation defined on $I$ (as discussed here) and so in cohesive homotopy theory the function monad $[I,-]$ has yet a bit more general abstract right to be pronounced as randomness.
Okay, thanks.
Nah, I have no problem with the standard notion of a random variable.
Okay, I see there isn’t a priori a notion of measure in your discussion. In general, in the second question I tried to gauge how the reading of [W,-] here matches that of hom-functors in space and quantity.
The abstract integration is of close interest to me and I’ll eventually look into it.
Related: Here’s the constant (100% deterministic) function $\eta_S('fixed s'):[W,S]$ applied to some $w$ for you :)
If you’ve never seen that: the unit of a Haskell monad is generally called “return” because monads were adopted in the ’80s mostly to implement the input/output-monad idea. Then runReader says that this return is that of Reader and not of some other monad.
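To make the unit concrete, here is a minimal sketch using only base: the function monad ((->) w) is the Reader monad in disguise, and its return is const, i.e. a 100%-deterministic “random variable” that ignores the possible world it is evaluated at (the name constantRV is my own illustration):

```haskell
-- The function monad ((->) w) is the Reader monad in disguise (base only).
-- Its "return" is const: a 100%-deterministic random variable.
constantRV :: Int -> (w -> Int)
constantRV = return            -- return s = \_ -> s

main :: IO ()
main = print (constantRV 7 "some possible world w")   -- prints 7
```

Evaluating the constant random variable at any world gives back the fixed value, as expected.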
Some of the common notation, in case you have to read something like the Verdier article you see here:
“fmap” is the arrow mapping of a functor, \x -> f is the lambda term $\lambda x. f$, and the brackets delimit a list. The type checker concludes that fmap must be that of the list functor, mapping a function $Z\to Z$ to the evident one in $Z^6\to Z^6$.
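For instance, a one-liner showing the list functor doing exactly this on a six-element list:

```haskell
-- fmap at the list functor: a function Z -> Z is sent to the evident
-- function Z^6 -> Z^6, applied componentwise
main :: IO ()
main = print (fmap (\x -> x * x) [1, 2, 3, 4, 5, 6 :: Int])
-- prints [1,4,9,16,25,36]
```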
Nikolaj,
thanks.
Sorry if my previous message sounded bad, I was maybe getting a bit frustrated that, while we can easily discuss super-sophisticated things here on the $n$Forum, the discussion often gets bogged down when one mentions the most elementary things, such as here the fundamental idea that a function on a space of possible outcomes may be thought of as a random variable. (Funny that by coincidence now a parallel thread opened where Connes is quoted as saying essentially this…)
I think it is important to distinguish the basic idea of a formalization of an informal concept from all the many ways of adding technical bells and whistles to that. The basic idea of Kolmogorov’s formalization of probability theory is just that, to consider functions on spaces of possible outcomes, and that texts about this go on about $\sigma$-algebras and whatnot is in some sense secondary.
Indeed I feel that the analogy with the Reader-monad in programming theory is most helpful for illuminating how excellent the general abstract point is:
from the point of view of the inside of a computer program, user input, such as the Reader monad is designed to model, is exactly in the spirit of random events coming in: the user is outside the deterministic rules of the program and provides input at will. That’s of course why key generation programs that need good randomness ask the user to hit some keys; that’s where they get their randomness from.
And as this shows, for this to work and have an interpretation in terms of random variables (and it does, this is real practice) there need not be a probability measure on $W$. Even if there is one, it doesn’t matter for the concept of random variable, as for instance used in a PGP key generation program. A probability measure is only needed once one starts talking about expectation values of random variables and things like this.
Nikolaj,
while you are in a Haskell mood: maybe we could make all this explicit by exhibiting the Haskell code that models programs depending on a single call to the actual random number generator StdGen.
I see that this is usually discussed with the state monad
type Random a = State StdGen a
so that one may read the random number and then replace it by a next one.
Let’s concentrate on the simpler case where we will just read the random variable and not update it to the next one. Then we’d replace the above by
type Random a = Reader StdGen a
I suppose, and then continue pretty much as before.
I suppose what I am asking is: could you type into the $n$Lab entry a paragraph with a handful of lines of Haskell code with brief comments, similar to what one finds in those online tutorials, that would show how to use the Reader monad for the above purpose of making everything in a program depend on one single random number?
That would be nice.
I’m not sure what you mean here (and I have to add that I don’t write much in Haskell; I mostly used it to understand an actual implementation of computational notions and tools from math).
Coming back to what I asserted in #2: most online things I looked at don’t come out and say that. The most I found so far is here, 458A in Fremlin’s 4th volume. There is some passing remark by Terry Tao that some random variables come naturally packaged by an almost-everywhere definition (and then one can extend everywhere by setting to $0$ on the measure zero set), but the impression I get is that few people will complain about the definition currently given in the nLab.
Urs #8: for the case of a single call to the random number generator, I think what you mean is
type Random a = Reader Int a
In other words, a random variable of type $\alpha$ is modeled as a function $X : \Omega \to \alpha$ from the sample space, but in this case the sample space is just the size of a machine word.
I would say that the type
type Random a = State StdGen a
describes a qualitatively different use case, where one doesn’t have to specify ahead of time the size of the sample space. This is actually something Terry Tao talks about in his notes on probability theory, that there is a bit of a mismatch between classical measure-theoretic foundations of probability and actual practice, because when working on a problem of probability theory you often want to avoid fixing the sample space as long as possible.
added to the probability theory page a pointer to Gromov’s (video-recorded) lecture series on generalizations/modifications of probability (especially probability seen as a “functor” from a “simple category” to a “complex category”)* + applications of probability within and without pure math.
*in quotes both b/c I am quoting Gromov and b/c Gromov himself insists on putting that in quotes, as it is work in progress and he is not precisely sure what that means in the context of probability theory
Noam,
thanks. Wouldn’t
StdGen Int
give a type of random integers, in that it may be read out and changed consecutively, such as to produce a sequence of random integers. (I may be way off here; that’s what I gathered from looking around unsystematically.)
Could you give me the Haskell code for the program that generates a single random integer and then evaluates some function on the integers on it (say doubling it), and then quits – such that this uses the Reader monad but not the State monad?
@Urs,
’random integer’ in some specified finite set, presumably :-)
Urs,
btw. I do intend to translate some ideas here to code eventually, but I want to understand the what and where from dcct first, and there, at everything simplicial, I had to hit the brakes and start reading up on some algebra. I wrote you an email two weeks ago today, did you get that?
Nikolaj,
I got your email that started with “ich spiele lang mit einer Idee für ein Projekt” (“I have long been playing with an idea for a project”), yes. That was interesting, but it didn’t seem to be a question to me and so I didn’t know what a quick reply would be.
But let me reply here: the idea to use modal type theory for serious industrial physics simulation seems excellent to me in general, but immensely ambitious at the moment, since the first step that remains to be done is to set up some basic “kernel” in the first place. Once that works, it would seem to be time to think about applications. But of course if the second step motivates you to take the first, that would be excellent.
It may be worthwhile to ignore all the homotopy theory for the time being. It will be necessary for getting “native” gauge theory, but from your mail I gather that you are thinking of applications to continuum mechanics governed by differential equations, and for that a modal 1-type theory may be sufficient, by and large.
Which language are you writing code in?
I was learning Haskell and the idea came up that the types could represent something from physics in the sense I read here; it’s not that I think there’ll be an “industrial” code anytime soon. At the latest, I’ll talk to you in October in Vienna.
You had mentioned the word “industries” in that email. But whatever it’s called, I suppose what you have in mind is simulations of the kind where people traditionally use finite element methods, solve difference equations on some lattice or other discretization, and so forth, right?
The idea would be, I suppose, to try to see if one could make progress by an approach where instead of specifying one fixed such way of discrete modelling of geometry to start with, one rather writes some more high level synthetic code in terms of an abstract (Haskell-)monad $\Im$ that is to be thought of as acting on types like an infinitesimal shape modality. (That is the most important modality for everything related to continuum physics.)
So one could try to write a library for simulating some flow dynamics, but now entirely abstractly, just in terms of such a monad.
In a second step one would then build an actual implementation of that abstract monad (Not sure what this is called in Haskell, does Haskell have “templates” as Java does? But you’ll know what I mean.), an implementation say on certain types of lattices, where $\Im$ would act by sending a lattice to something like the result of identifying first order neighbour vertices, or something like this.
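A toy sketch of this two-step idea (all names here, Infinitesimal, Lattice, nearby, derivAt, are my own hypothetical illustrations, not an existing library, and Lattice is just the list monad in disguise): first program against an abstract monad thought of as providing infinitesimal neighbourhoods, then supply a concrete lattice implementation.

```haskell
-- Step 1: an abstract interface, a monad "im" equipping points with
-- infinitesimal neighbourhoods (hypothetical sketch, not a real library)
class Monad im => Infinitesimal im where
  nearby :: Double -> im Double   -- a point together with its neighbours

-- Step 2: a concrete implementation on a 1d lattice
newtype Lattice a = Lattice { runLattice :: [a] }

instance Functor Lattice where
  fmap f = Lattice . map f . runLattice
instance Applicative Lattice where
  pure x = Lattice [x]
  Lattice fs <*> Lattice xs = Lattice [f x | f <- fs, x <- xs]
instance Monad Lattice where
  Lattice xs >>= k = Lattice (concatMap (runLattice . k) xs)

h :: Double
h = 1e-3   -- lattice spacing

instance Infinitesimal Lattice where
  nearby x = Lattice [x - h, x, x + h]   -- first-order neighbours

-- A derivative computed only through the abstract interface
derivAt :: (Double -> Double) -> Double -> Double
derivAt f x =
  case fmap f (nearby x) of
    Lattice [fl, _, fr] -> (fr - fl) / (2 * h)
    _                   -> error "unexpected neighbourhood shape"

main :: IO ()
main = print (derivAt (\x -> x * x) 3)   -- close to 6.0
```

The point of the sketch is only the separation of concerns: derivAt never mentions the lattice, so a different implementation of the abstract monad could be swapped in without touching it.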
Thanks for the links, that gives me an impression as to where you are.
I suppose a first step towards doing any modelling based on monads would be to warm up with some general yoga with monads.
For instance maybe this is a good first little item to aim for:
given any two Haskell functions $f \colon X \to Z$ and $g\colon Y \to Z$, write code that gives the fiber product type $X \times_Z Y$. (Hm, is something like this possible in Haskell??)
If this is possible, then next we’ll need it for this case: let $\bigcirc$ be any Haskell monad, and $f \colon X \to Y$ any Haskell-function, write code that gives the canonical function $X \to Y \times_{\bigcirc Y} \bigcirc X$. (Hm, is something like this possible in Haskell??)
I don’t know / I don’t see how; and maybe, but then it’s not completely straightforward. What you want here is a strong comprehension à la “give me a type with only those terms such that …”, but Haskell is of course not dependently typed with sigmas and dependent products.
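The closest Haskell sketch I can see is a runtime-checked approximation: since the type system cannot express the condition $f(x) = g(y)$, one hides the constructor behind a smart constructor that checks it dynamically (FiberProd and fiberProd are hypothetical names for illustration, not a faithful pullback type):

```haskell
-- An approximation of the fiber product X ×_Z Y: a pair (x,y) together
-- with a runtime check that f x == g y. The type itself cannot enforce
-- the equation; only the smart constructor does. (Hypothetical sketch.)
data FiberProd x y = FiberProd x y deriving (Show, Eq)

fiberProd :: Eq z => (x -> z) -> (y -> z) -> x -> y -> Maybe (FiberProd x y)
fiberProd f g x y
  | f x == g y = Just (FiberProd x y)
  | otherwise  = Nothing

main :: IO ()
main = do
  print (fiberProd (`mod` 3) (`mod` 3) (4 :: Int) (7 :: Int))  -- Just (FiberProd 4 7)
  print (fiberProd (`mod` 3) (`mod` 3) (4 :: Int) (6 :: Int))  -- Nothing
```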
Googling tells me this Agda implementation of HoTT has
https://github.com/HoTT/HoTT-Agda/blob/master/lib/types/Pullback.agda
and btw. there are packages for Haskell which provide a category theoretical framework, e.g.
https://hackage.haskell.org/package/categories
Right now I’m reading algebra, though, so I get my groupoids straight.
Okay, if there is no fiber product of types in Haskell, and if you are after the vision of that email of yours, then you need to switch away from Haskell to something more expressive NOW. :-)
Hey, it’s not my fault that the CS folks are so slow ;) But yeah, it would be the only reasonable thing to do, to code in a suitable language; but there is no community to learn stuff from, and if you learn Agda/Coq/Haskell spin-offs you only learn it for one project. Haskell actually has a very cool community, and that is what motivated my choice of topic to learn about. Related may be this SO question.
there is no community to learn stuff from and if you learn Agda/Coq
Is that so? (This is a genuine question, I might not know.)
The HoTT Café group has recently been doing well, it seems to me. Young people who pick up the HoTT book as their textbook and turn to the HoTT Café when getting stuck on the exercises seem to be doing well.
I might be wrong. Their big advantage however is that they know for sure that they made the right choice :-)
They made the right choice for writing papers about HoTT, not to write code anybody will use ;)
If you actually want to write code in a dependently typed programming language, then Coq or Agda is the way to go. You can do a bit of faking of dependent types in Haskell with typeclasses, but it’s kludgy and awkward and starts to look unbearable once you’ve actually programmed with real dependent types.
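One common flavor of the kind of faking meant here (a sketch, and only one of several mechanisms; whether it counts as typeclass trickery or GADT trickery is a matter of taste) is length-indexed vectors via GADTs and promoted datatypes, where the “dependency” lives at the type level:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level naturals, promoted via DataKinds
data Nat = Z | S Nat

-- Vectors indexed by their length
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- A total head function: calling it on an empty vector is a type error,
-- not a runtime crash
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

main :: IO ()
main = print (vhead (VCons (1 :: Int) (VCons 2 VNil)))  -- prints 1
```

This works, but as soon as one wants to compute with those type-level indices (appending, splitting, arithmetic proofs) the encoding grows singletons and constraint plumbing, which is the “kludgy and awkward” part.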
[edit: this message overlapped with Mike’s above]
Nikolaj, either you are visionary and ambitious as in the email you had sent me, or you walk the well trodden paths with the crowd without a vision.
But you should look into Agda. It’s not really used much yet for real-world programming, it seems, but it is close to Haskell. And so maybe you will be the one to use it for real-world programming, first.
In any case, I don’t see how you can at the same time say you are interested in applying modal type theory in programming and constrain yourself to languages that are guaranteed not to admit it. That’s self-contradictory.
Mike, is there by now a convenient bundled package that one may download and straightforwardly unpack and compile to get up and running with either Agda-HoTT or Coq-HoTT?
Back then I once had spent some hours (or days) getting Coq installed and everything, and when finally done I ran into that darn universe inconsistency and was told that I had to patch the binaries. At that point I gave up, for such a time sink makes me nervous given that it’s already outside of what I am supposed to be doing.
One day I’d like to try again. But only when meanwhile the installation has been streamlined a bit more.
I don’t know about Agda. For Coq, it’s no longer the case that you have to patch the binaries; the library works with Coq 8.5beta. At present, I believe the install script does still compile Coq from source, but it should do it all for you with just a few commands. Once Coq 8.5 is officially released, we may put together an even easier installer that works with the standard distribution copy of Coq.
Thanks for the info. And another question: a while back you had announced working on a rewrite of the libraries for modal operators. Has that ambition found its realization meanwhile?
I am just asking because in view of people like Nikolaj who have ambition to try to start experimenting with programming (“programming”) in modal type theory, it would be great if one could just hand them a software bundle and say “here you go, use this”.
Let me know if you come across people like me :D Regarding #27: I’m visionary enough, I’ve also been reading into the HoTT book since it came out, and I want to use such languages eventually. But it’s not like I can just sit down and code when I can’t claim to understand the math or the programming practice well enough. Given that nobody will give me anything for getting into this (it’s my interest), I need to look out to learn stuff that I’ll use more globally. It’s a question of practicability, and that’s why it can only be a long-term thing. But as you see, I lurk the nForum consistently enough and I read my books ;)
Urs, why do you say that random variables are functions? They are equivalence classes of functions at best, but perhaps better they should be elements of some abstract algebra. You know the relevant commutative von Neumann algebra theory.
why do you say that random variables are functions?
That’s the standard definition since Kolmogorov 1933, as we discussed above.
Urs #33: I believe Bas Spitters is confirming something along the lines of what I said in #2. Even though I did not succeed in finding a good online citation, I have a sneaking suspicion he could, and it would be good to be scrupulous on this point which is a fair one.
Yes, I know and agree, as I said in #3. But Bas asked me why I said what I said, and I replied to that truthfully.
The entry should be expanded accordingly, adding the statement that, to the extent that one is interested in forming expectation values and moments, random variables that differ on a set of measure zero are naturally identified.
However, in light of the point I was trying to drive at here, I think that in fact a function on a space that is not even equipped with a measure but that is being thought of as a space of “possible outcomes” of sorts may already be usefully thought of as a random variable, and that this is in fact common practice in various corners. Therefore I think the point about almost-everywhere identification is somewhat secondary to the heart of the concept. Nonetheless, it deserves to be discussed in the entry. Anybody who feels energetic is invited to do so.
The point I want definitively cleared up is: what’s the standard definition? Since Bas brought it up again, I join him in asking whether what you say is standard actually is. But I’ll let others more expert than me weigh in.
Urs #13:
With the disclaimer that I am not a real Haskell hacker either, here is a short example.
Consider the following functions, written in Haskell do notation:
rollDie :: State StdGen Int
rollDie = do
  rng <- get
  let (d,rng') = randomR (1,6) rng
  put rng'
  return d

rollDice :: Int -> State StdGen [Int]
rollDice 0 = return []
rollDice n = do
  d <- rollDie
  ds <- rollDice (n-1)
  return (d : ds)
The expression rollDie
denotes a computation which will query the standard random number generator once to produce an integer between 1 and 6, while rollDice n
denotes a computation which will make n calls to rollDie
to produce a list of integers. These computations are suspended inside the state monad, though – if we want to get actual random executions we have to plug these expressions into the system random number generator explicitly, e.g., with something like this at the top level:
> getStdRandom (runState (rollDice 20))
[3,6,5,4,3,1,5,6,2,1,1,2,3,4,4,2,1,2,3,2]
> getStdRandom (runState (rollDice 20))
[2,6,5,3,5,6,4,2,5,6,5,3,5,5,5,4,5,6,5,2]
The expression getStdRandom (runState (rollDice 20))
has type IO [Int]
.
Notice how the code for rollDie
explicitly updates the random number generator (put rng'
) after making the query. Suppose we had forgotten to do that:
brokenrollDie :: State StdGen Int
brokenrollDie = do
  rng <- get
  let (d,rng') = randomR (1,6) rng
  return d

brokenrollDice :: Int -> State StdGen [Int]
brokenrollDice 0 = return []
brokenrollDice n = do
  d <- brokenrollDie
  ds <- brokenrollDice (n-1)
  return (d : ds)
Then the semantics would not be quite as intended:
> getStdRandom (runState (brokenrollDice 20))
[3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3]
Now to your question:
Could you give me the Haskell code for the program that generates a single random integer and then evaluates some function on the integers on it (say doubling it), and then quits – such that this uses the Reader monad but not the State monad?
We can write a function that will transform something in the Reader monad into a probabilistic computation:
sample6 :: Reader Int a -> State StdGen a
sample6 x = do
  omega <- rollDie
  return (runReader x omega)
This makes sense if we view x as a random variable over the sample space {1,2,3,4,5,6} with the uniform distribution. For example, we can define a random variable standing for half a Fibonacci number:
randomFib :: Reader Int Double
randomFib = do
  let fibs = 0 : 1 : zipWith (+) fibs (tail fibs) :: [Integer]
  i <- ask
  return (0.5 * fromIntegral (fibs !! (i-1)))
and then we can sample it:
> getStdRandom (runState (sample6 randomFib))
0.5
> getStdRandom (runState (sample6 randomFib))
2.5
But of course we’re still using the State monad when we do this sampling, and it’s still important that the system random number generator gets updated on each call. For example, if we had instead written…
brokensample6 :: Reader Int a -> State StdGen a
brokensample6 x = do
  omega <- brokenrollDie
  return (runReader x omega)
then, again, this is probably not what we expected:
> getStdRandom (runState (brokensample6 randomFib))
1.0
> getStdRandom (runState (brokensample6 randomFib))
1.0
> getStdRandom (runState (brokensample6 randomFib))
1.0
> getStdRandom (runState (brokensample6 randomFib))
1.0
> getStdRandom (runState (brokensample6 randomFib))
1.0
Yes, the new libraries for modalities are pretty much done.
Noam, thanks.
What you call broken and unexpected goes in the direction of what I was asking for. While this may be unexpected to somebody with certain expectations, it is not broken. To see this, consider the case that instead of the type of integers we work with the type of lists of 20 integers. Then a program that instantiates a single random instance of that type once and outputs it yields the same output as the programs that you consider as unbroken above.
I suppose that way you can indeed write a Haskell program that does not ever mention the State monad, just the Reader monad, and does produce unbroken-looking random output.
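A minimal self-contained sketch of such a program (the generator lcg is a hypothetical stand-in for StdGen, so that nothing beyond GHC’s boot libraries is needed): one seed, one world, no State monad anywhere.

```haskell
import Control.Monad.Trans.Reader (Reader, runReader, ask)

-- A toy linear congruential generator standing in for StdGen
-- (hypothetical stand-in; only for illustration)
lcg :: Int -> Int
lcg s = (1103515245 * s + 12345) `mod` 2147483648

-- Twenty die rolls as a single random variable in the Reader monad:
-- the whole "possible world" is derived from one seed at once
dice20 :: Reader Int [Int]
dice20 = do
  seed <- ask
  let seeds = take 20 (tail (iterate lcg seed))
  return (map (\s -> 1 + s `mod` 6) seeds)

main :: IO ()
main = print (runReader dice20 42)   -- one world, one sample, then quit
```

In a real program one would feed in a genuinely random seed (say from the system) instead of 42, but the program itself only ever reads its fixed world through ask.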
I understand what you’re saying, and I think it would be great if people worked out a way of programming with random variables that was directly connected to the classical Kolmogorov definition. What I see as the issue with working in a fixed Reader monad, invoking Tao, is that when we program/prove with probabilities it is difficult to know the sample space in advance, and besides, doing so breaks modularity (we write programs/proofs which are difficult to compose). That’s why people usually program either with random number generators (in which case random variables are interpreted as stateful computations with the sample space implicit) or with the monad of probability distributions (in which case random variables are interpreted as distributions and the sample space disappears).
Instead of fixing a global sample space for all random variables, we would need machinery for taking random variables which may or may not be defined over different sample spaces and plugging them together. That has a lot of the feeling of linear type theory, in the broad sense.
Thanks. I am not genuinely concerned here with the practice of real-world programming with real-world demands. I just want to substantiate the claim that the reader monad may be thought of as producing spaces of random variables over a fixed, yes, space of possible worlds $W$, by pointing out (if true, but I suppose it is) that every Haskell program whose output is conditioned on a single random instance of type $W$ naturally “is inside” (or whatever the jargon is) of $Reader W$.
Given the discussion on the proper concept of random variable that we are having here and elsewhere, I thought and still think that, simplistic as it is, this is a useful illustration to keep in mind and comes as close as it gets to being “proof” that the Reader monad $\prod_W W^\ast$ deserves to be pronounced “randomness”, in direct analogy to how its sibling $W^\ast \prod_W$ deserves to be pronounced “necessity”.
Also, I am wondering if restriction to a fixed space of possible worlds is really that unheard of in practice. Here is an example I am thinking of where it would be just the right thing:
at the LHC, from before it started operation and still these days, they run competitions where contestants are handed a program that behaves as if displaying LHC detector data, only that the data is randomly generated, and the user’s task is to draw the correct conclusions about the particle physics that would cause this data.
Such a program would be run once by every user (disregarding the user saving their state to disk and continuing the next day), and would generate its random data once at the beginning, sending one copy of it back to the contest panel for them to archive. (Or at least that would be a natural way to set this up.)
Along these lines, another good picture to keep in mind for what I am after is to abandon your latently Newtonian worldview (where all events are progressing in time, and you may generate your random numbers in linear order one after another) to the proper Einsteinian one where spacetime is one global object that is not to be time sliced. A model for a random Newtonian spacetime you could program by consecutively reading out new random numbers, but for a program modelling Einstein spacetimes (computational cosmology!, causal dynamical triangulation!) you’ll need something closer to generating all the relevant random data “before the beginning of time”, if you see what I mean.
One might think me rambling here is all in the vein of “too much philosophy on the nForum as of late”, but given the discussions that are still going on, this is maybe more justified than one might have thought.
regarding the issue around #22 of Haskell not having proper dependent types, there is a new comment on that by Dan Doel on G+ here
Mike is right there, in that I was mostly saying that one is learning the dependent language for a single application and then won’t use it in the context of projects with other people. I’m looking into it anyhow (on the side, given that I’m finishing a vastly unrelated PhD in plasma chemistry at the moment).