
• CommentRowNumber1.
• CommentAuthorTobyBartels
• CommentTimeSep 10th 2018

Change the text from ’m’ (which it had been since 2015!) to a real article.

• CommentRowNumber2.
• CommentAuthorTobyBartels
• CommentTimeSep 10th 2018

Now lowercase and with less punctuation

1. In the expanded form of Bayes’ rule, why did you write “$P(E|\neg H) - P(E|\neg H)P(H)$” in the denominator rather than “$P(E|\neg H)(1-P(H))$” or “$P(E|\neg H)P(\neg H)$”? It seems like a weird choice to me because it disguises the symmetry between $H$ and $\neg H$.

2. Great that you added something to the article! Not that it really matters, but surely the default name should be Bayes’ rule, with the apostrophe at the end, with a redirect from Bayes rule without the apostrophe?

• CommentRowNumber5.
• CommentAuthorTim_Porter
• CommentTimeSep 11th 2018

I would second that point about Bayes’. Punctuation that is not standard should be avoided (when it does not interfere with functionality).

• CommentRowNumber6.
• CommentAuthorUrs
• CommentTimeSep 11th 2018

Bayes rules!

• CommentRowNumber7.
• CommentAuthorTobyBartels
• CommentTimeSep 11th 2018

I left the apostrophe out of the page name for the same reason that it's left out of Stokes theorem. It may be ‘Bayes's Rule’ or ‘Bayes' Rule’, but it's also ‘the Bayes Rule’. So rather than argue about whether to use the old rule (singular nouns and names ending in an /s/ sound (or other sibilant) take only an apostrophe in the possessive, just like plurals formed by adding the letter ⟨s⟩ (and possibly other changes) do) or to use the new rule (all singular nouns and names take an apostrophe and an ⟨s⟩ in the possessive, no exceptions), we don't use a possessive at all, but use the name as an attributive noun.

The practice of using attributives rather than possessives is catching on mostly due to naming things after multiple people (compare ‘the Kelvin–Stokes theorem’ to ‘Kelvin's and Stokes's theorem’, which would be ‘Kelvin and Stokes's theorem’ if they had worked together), but not having to argue about possessives is another good reason. (But if you want to argue about possessives, then I say that it's ‘Stokes's’. Way back in 1926, Fowler wrote

> It was formerly customary, when a word ended in -s, to write its possessive with an apostrophe but no additional s, e.g. Mars’ hill, Venus’ Bath, Achilles’ thews. In verse, & in poetic or reverential contexts, this custom is retained, & the number of syllables is the same as in the subjective case, e.g. Achilles’ has three, not four; Jesus’ or of Jesus, not Jesus’s. But elsewhere we now add the s & the syllable, Charles’s Wain, St James’s not St James’, Jones’s children, the Rev. Septimus’s surplice, Pythagoras’s doctrines.

and while we may revere George Stokes and Thomas Bayes and find poetic beauty in their mathematics, I still advocate using modern grammar when talking about them.)

• CommentRowNumber8.
• CommentAuthorTobyBartels
• CommentTimeSep 11th 2018

@Oscar

I didn't write $P(\neg{H})$, because I wanted it broken down into the simplest concepts. The same goes for expanding rather than factoring subexpressions, although that's not as significant. There are certainly intermediate forms of the rule, but I just wrote out the two most extreme forms.

Although come to think of it, $P(E|H) P(H) - P(E|\neg{H}) P(H) + P(E|\neg{H})$ might be an even better way to put the denominator than $P(E|H) P(H) + P(E|\neg{H}) - P(E|\neg{H}) P(H)$ (as I put it). That way, it's clearer that there are two ways to write it with factored subexpressions: $P(E|H) P(H) + P(E|\neg{H}) \big(1 - P(H)\big)$ (as you suggested) or $\big(P(E|H) - P(E|\neg{H})\big) P(H) + P(E|\neg{H})$.
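All four ways of writing the denominator are equal by the law of total probability, which can be checked numerically; a quick sketch (the probability values here are arbitrary, for illustration only):

```python
# Check that the various forms of the denominator of Bayes' rule agree.
# The numeric values are arbitrary illustrations.
p_h = 0.3          # P(H)
p_e_h = 0.8        # P(E|H)
p_e_nh = 0.1       # P(E|not-H)

# Fully expanded form as originally written:
d1 = p_e_h * p_h + p_e_nh - p_e_nh * p_h
# Reordered expanded form:
d2 = p_e_h * p_h - p_e_nh * p_h + p_e_nh
# The two forms with factored subexpressions:
d3 = p_e_h * p_h + p_e_nh * (1 - p_h)
d4 = (p_e_h - p_e_nh) * p_h + p_e_nh

assert abs(d1 - d2) < 1e-12
assert abs(d1 - d3) < 1e-12
assert abs(d1 - d4) < 1e-12
print(d1)  # P(E) by the law of total probability, approximately 0.31
```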

• CommentRowNumber9.
• CommentAuthorTim_Porter
• CommentTimeSep 11th 2018
• (edited Sep 11th 2018)

@Toby: I agree with you. In fact I had been idly looking up Stokes and his history, noted ’the Kelvin–Stokes theorem’, and realised that one can argue (as you did) that ’Stokes theorem’ is related to the use of ’the Stokes theorem’, i.e. as an adjective rather than a possessive. Yes, I like that ;-)

• CommentRowNumber10.
• CommentAuthorRichard Williamson
• CommentTimeSep 11th 2018
• (edited Sep 11th 2018)

Hehe, I just have never heard anyone say anything other than Bayes’ theorem or Stokes’ theorem (’the Bayes theorem’ actually sounds to me as though the definite article has been added for comical effect!), but it’s not important, I don’t mind being outvoted :-).

• CommentRowNumber11.
• CommentAuthorMike Shulman
• CommentTimeSep 11th 2018

I’ve never heard anyone say something like “the Bayes theorem” when there is only one person named. It sounds weird to me, but I could live with it as a back-formation from the multi-person case. But I don’t see any grammatical justification for dropping the article.

• CommentRowNumber12.
• CommentAuthorTobyBartels
• CommentTimeSep 12th 2018

We don't put ‘the’ in page titles (except for generalized the, which is about the word). That's not an nLab thing or even a math thing; almost nobody does that. But you're right that one couldn't drop it in running text without bringing in the possessives. (English needs a definite determiner here, which could be the definite article or could be a possessive but cannot be an attributive noun.)

As for ‘Stokes theorem’, sometimes I use that in the plural! (See the title of Chapter 7 in these notes, for example.) The idea is that the specific theorems taught in Vector Calculus, such as the Kelvin–Stokes Theorem and the Ostrogradsky–Gauss Theorem, besides being special cases of the one overarching Stokes Theorem, can also be considered as separate Stokes theorems when treated individually.

• CommentRowNumber13.
• CommentAuthorRichard Williamson
• CommentTimeSep 12th 2018
• (edited Sep 12th 2018)

For what it’s worth, my feeling would be that we should try, on the main nLab, to be as unobtrusive as possible in our choices, irrespective of our personal preferences. Thus for example I would probably not look at the title at all if I went to look at Bayes’ theorem, which is why I don’t feel it’s important; but if I did, I would certainly react a bit to what I would interpret as careless grammar (remembering that people are not typically going to look here for an explanation). If that is the way that the majority of people would react as well, my feeling would be to choose something more conventional.

• CommentRowNumber14.
• CommentAuthorTodd_Trimble
• CommentTimeSep 12th 2018

> to be as unobtrusive as possible in our choices

> but if I did I would certainly react a bit

I think that’s a basic principle of good mathematical writing: not to distract the reader by calling attention to itself.

• CommentRowNumber15.
• CommentAuthorTobyBartels
• CommentTimeSep 12th 2018

Maybe I've just gotten so used to leaving off possessive suffixes in the names of theorems and the like (even though I'm not consistent about it), but nothing about ‘the Bayes Rule’ (or ‘Bayes rule’ as a title) would distract me. Do people feel the same about ‘Stokes theorem’ (which is how it first appeared here, nearly 8 years ago)?

For what it's worth, there is a little more explanation of the name in the article now: no treatises on grammar, but enough variety to get across the idea. In particular, the exact phrase ‘the Bayes Rule’ now appears.

• CommentRowNumber16.
• CommentAuthorMike Shulman
• CommentTimeSep 12th 2018

Yes; I would also prefer “Stokes’ theorem”.

• CommentRowNumber17.
• CommentAuthorRichard Williamson
• CommentTimeSep 12th 2018
• (edited Sep 12th 2018)

I too would prefer Stokes’ theorem, yes. But I do think what you have is fine really, I don’t feel strongly about it.

• CommentRowNumber18.
• CommentAuthorDavid_Corfield
• CommentTimeJan 6th 2019

If in HoTT, $\sum_{x:A} B(x)$ for propositions $A$ and $B$ is a type that can represent the more typical conjunction of dependency, “It’s raining, and doing so heavily”, rather than the kind we teach our logic students, such as “$2+2=4$ and London is the capital of the UK”, perhaps we can see Bayes Rule as resting on the relation between two ways to factor a dependent sum/pair:

$\sum_{x:A} B(x) \simeq \sum_{y:B} A(y).$

It has often been observed that probability is an extension of logic. I wonder if an intuitionistic Bayesianism would be the path to pursue here.

• CommentRowNumber19.
• CommentAuthorAli Caglayan
• CommentTimeJan 6th 2019
• (edited Jan 6th 2019)

@David I don’t quite follow how you factored that. $A$ is a type and $B$ is a type family, so the second sigma doesn’t make sense to me.

• CommentRowNumber20.
• CommentAuthorDavid_Corfield
• CommentTimeJan 7th 2019

Sorry, I wasn’t being precise. I was imagining something like a subset/relation $R$ of $A \times B$ for two sets $A$ and $B$, and then thinking of $B(a)$ as the subset of $B$ whose elements are $R$-related to $a$ in $A$. Then likewise for $A(b)$.

I can’t see that much has been written on extending Martin-Löf type theory with probabilities. I started compiling a list of where probability theory meets category theory/type theory. This includes

• Harry Crane, Logic of probability and conjecture, (pdf) (longer version in progress)

Crane considers a monad, $P$, acting on types-as-propositions to convert them into conjectures whose elements are corresponding pieces of evidence. Then we might have a proof of inference between two propositions, $f: A \to B$, acting on evidence for $P A$ to give evidence for $P B$. Plausibility comes before probability here.

Are there any other options? What precisely is one to assign a probability to?

• CommentRowNumber21.
• CommentAuthorUrs
• CommentTimeJan 7th 2019

> I was imagining something like a subset/relation $R$ of $A \times B$ for two sets $A$ and $B$, and then thinking of $B(a)$ as the subset of $B$ which are $R$-related to $a$ in $A$. Then likewise for $A(b)$.

So then $R$ is a dependent type on $A \times B$ and the rule you want is the functoriality of $\Sigma$ along $A \times B \to B \to \ast$ which equals $A \times B \to A \to \ast$:

$\underset{a \colon A}{\sum} \left( \underset{b \colon B}{\sum} R(a,b) \right) \;\simeq\; \underset{b \colon B}{\sum} \left( \underset{a \colon A}{\sum} R(a,b) \right)$
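This equivalence is just the commutativity of iterated dependent sums, witnessed by swapping the first two components of a triple. A minimal sketch in Lean 4 with Mathlib (the name `sigmaComm'` is my own choice here, and Mathlib may well provide this equivalence already):

```lean
import Mathlib.Logic.Equiv.Basic

-- Σ (a : A), Σ (b : B), R a b  ≃  Σ (b : B), Σ (a : A), R a b,
-- by swapping the first two components of each triple.
def sigmaComm' {A B : Type} (R : A → B → Type) :
    (Σ a : A, Σ b : B, R a b) ≃ Σ b : B, Σ a : A, R a b where
  toFun := fun ⟨a, b, r⟩ => ⟨b, a, r⟩
  invFun := fun ⟨b, a, r⟩ => ⟨a, b, r⟩
  left_inv := fun ⟨a, b, r⟩ => rfl
  right_inv := fun ⟨b, a, r⟩ => rfl
```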

• CommentRowNumber22.
• CommentAuthorDavid_Corfield
• CommentTimeJan 7th 2019

Yes, good. And, of course, as you have it, $R$ could be a dependent set, so a span. Then it’s easy to see how something like $P(a) P(b|a) = P(b) P(a|b)$ could arise, say, from a uniform distribution over the elements of $R$.
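The uniform-distribution remark can be checked concretely on a small example; a sketch using exact rationals (the relation `R` below is an arbitrary illustration):

```python
from fractions import Fraction

# An arbitrary relation R ⊆ A × B, listed as pairs; put the uniform
# distribution on its elements and read off marginals and conditionals.
R = [(0, 'x'), (0, 'y'), (1, 'y'), (2, 'x'), (2, 'y'), (2, 'z')]
n = Fraction(len(R))

def P_a(a):             # marginal P(a)
    return Fraction(sum(1 for (a2, _) in R if a2 == a)) / n

def P_b(b):             # marginal P(b)
    return Fraction(sum(1 for (_, b2) in R if b2 == b)) / n

def P_b_given_a(b, a):  # conditional P(b|a) = P(a,b) / P(a)
    return Fraction(sum(1 for pair in R if pair == (a, b))) / (P_a(a) * n)

def P_a_given_b(a, b):  # conditional P(a|b) = P(a,b) / P(b)
    return Fraction(sum(1 for pair in R if pair == (a, b))) / (P_b(b) * n)

# P(a) P(b|a) = P(b) P(a|b), i.e. both equal the joint P(a,b).
for a in {pair[0] for pair in R}:
    for b in {pair[1] for pair in R}:
        assert P_a(a) * P_b_given_a(b, a) == P_b(b) * P_a_given_b(a, b)
```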

But how to phrase things in terms either of the ’$\Gamma \vdash A$ is true’ or ’$\Gamma \vdash a: A$’ styles of presenting type theory?

I guess you can see why the Giry monad gets introduced, where $p: A \to Dist(B)$ corresponds to a conditional distribution. Then a probability for a proposition can appear as the value corresponding to an element of $Dist(\mathbf{2})$.
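On finite sets the Giry monad can be sketched directly; `dist_unit` and `dist_bind` below are hypothetical names for this illustration, and the prior and kernel are arbitrary example data:

```python
from fractions import Fraction

# A finite distribution: a dict mapping outcomes to probabilities summing to 1.

def dist_unit(x):
    """The Dirac distribution at x (the monad's unit)."""
    return {x: Fraction(1)}

def dist_bind(d, k):
    """Monadic bind: push d forward along a kernel k : X -> Dist(Y)."""
    out = {}
    for x, px in d.items():
        for y, py in k(x).items():
            out[y] = out.get(y, Fraction(0)) + px * py
    return out

# A conditional distribution p : A -> Dist(B), with A = {0, 1}, B = {'e', 'f'}.
def p(a):
    return {'e': Fraction(3, 4), 'f': Fraction(1, 4)} if a == 0 \
      else {'e': Fraction(1, 2), 'f': Fraction(1, 2)}

prior = {0: Fraction(1, 3), 1: Fraction(2, 3)}
marginal = dist_bind(prior, p)        # pushforward distribution on B

# A probability for a proposition appears as an element of Dist(2):
chi = lambda b: dist_unit(b == 'e')   # characteristic kernel B -> Dist(2)
print(dist_bind(marginal, chi)[True]) # prints 7/12, i.e. P(outcome is 'e')
```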

Have people mixed probability theory with dependent type theory or even HoTT? I suppose you could have $p: A \to Dist(B(x))$, where $B(x)$ depends on $x: A$. Have people looked for a version of the Giry monad for higher groupoids? At the very least one might want equivariant distributions.

We were considering the reader monad as a means of expressing random variables.

Perhaps probabilistic HoTT will just be a simple exercise in the adjoint logic program, when they finish the dependent type version.

A jumble of thoughts for the new year.

• CommentRowNumber23.
• CommentAuthorSam Staton
• CommentTimeJan 7th 2019
• (edited Jan 7th 2019)

Hi, thanks for setting up the list and all the interesting links. Regarding dependent types, several things in probability do have a dependent flavour. For example, disintegration as often phrased looks a bit like something of the type

Dist (Sigma (a : A) (B a)) -> Pi (a : A) (Dist (B a))

(Quasi-Borel spaces are locally cartesian closed, but we haven’t investigated that properly yet.)
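That disintegration type has a direct finite analogue: condition a joint distribution on its first component. A sketch with exact rationals (`disintegrate` is a hypothetical name, and the conditional family is only defined where the marginal is nonzero):

```python
from fractions import Fraction

def disintegrate(joint):
    """Finite analogue of Dist(Sigma (a : A) (B a)) -> Pi (a : A) (Dist (B a)):
    send a joint distribution on pairs (a, b) to the family of conditionals
    b |-> P(b | a), defined wherever the marginal P(a) is nonzero."""
    marginal = {}
    for (a, _), prob in joint.items():
        marginal[a] = marginal.get(a, Fraction(0)) + prob
    return lambda a: {b: prob / marginal[a]
                      for (a2, b), prob in joint.items() if a2 == a}

joint = {(0, 'x'): Fraction(1, 6), (0, 'y'): Fraction(1, 3),
         (1, 'x'): Fraction(1, 2)}
cond = disintegrate(joint)
print(cond(0))  # {'x': Fraction(1, 3), 'y': Fraction(2, 3)}
```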

• CommentRowNumber24.
• CommentAuthorDavid_Corfield
• CommentTimeJan 8th 2019
• (edited Jan 8th 2019)

Re #23, this would seem to bring us close to the dependent linear De Morgan duality (Prop. 3.18, p. 43) of Urs’s paper

Perhaps that’s not surprising if we take $Dist(X)$ as some subobject of $[X, \mathbb{R}^{\geq 0}]$.

People are certainly thinking of $[0,1]$ playing a dualizing role:

> The role played by the two-element set $\{0,1\}$ in these classical results—e.g. as “schizophrenic” object—is played in our probabilistic analogues by the unit interval $[0,1]$. (The Expectation Monad in Quantum Foundations, p. 2)