• CommentRowNumber1.
• CommentAuthorTim_Porter
• CommentTimeMar 9th 2010
• (edited Mar 9th 2010)

Zoran has raised an important issue in Cech methods, namely that we should use the correct accented names for titles of entries. (With redirects, any difficulties that might occur can be avoided. Some people may find the accents difficult to produce on their machines.) For example, Čech should be used instead of Cech. (Note that this comes out correctly in the preview, but on recalling it to edit, it had become ?ech. Most annoying.)

I feel that he is right and would propose that wherever feasible this should be done. The problem is that in things like TOCs and sidebars, there are parsing errors (see Čech cohomology). These do not occur with the unaccented name. Is there a way around this? Sidebars are nice but could be omitted in such cases, or is there a better resolution of the problem?

I will change back the few that I changed recently. These are especially on stuff that I have been handling in the main.

Does anyone have the link to where all the usual accented characters can be found? And which are possible for this forum as I had some difficulty with one that I needed.

• CommentRowNumber2.
• CommentAuthorTim_Porter
• CommentTimeMar 9th 2010

This is weird! I thought that the other sidebars (e.g. other entries in cohomology contents) would be similarly affected, but no, they are fine, and the Čech cohomology entry reads perfectly correctly there. It seems it is just the Čech cohomology entry itself that is wrong!

• CommentRowNumber3.
• CommentAuthorHarry Gindi
• CommentTimeMar 9th 2010

Speaking of that article, would someone mind writing something up about classical Cech cohomology? It seems like the article jumps into the HTT case immediately.

• CommentRowNumber4.

The classical Cech complex and its cohomology are treated in the section "Abelian Cech cohomology" in Cech cohomology.

• CommentRowNumber5.
• CommentAuthorTim_Porter
• CommentTimeMar 9th 2010

Now then Toby what did you do? It works properly now!

• CommentRowNumber6.
• CommentAuthorTobyBartels
• CommentTimeMar 9th 2010
• (edited Mar 9th 2010)

I fixed the sidebar at Cech cohomology by adding in a blank line before it; sometimes that helps. (I also fixed the duplicate page.)

The reason for avoiding accents in the naming conventions is not about limitations in our software, but instead about limitations in the users' software. Most people using keyboards designed for use outside of eastern Europe have trouble typing the accented character. This doesn't matter so much for pages that already exist, since we have redirects; we just need to put the redirect in. It matters more for pages that don't already exist, such as Cech groupoid (currently nonexistent but linked from Cech cover).

Suppose that a link from one page (say, Cech cohomology) were written with the accented character, while the link on the other page (say, Cech cover) were written by someone (person X) with a western keyboard who doesn't know how to get accented characters. (That's not really a problem on these pages, since they are peppered with the name with the accented character, which could be copied and pasted. But they didn't always look like that.) And suppose that X then creates the new page. X doesn't know how to get accented characters, and X may not even realise that a link to the accented character exists on another page. So X cannot create the redirect either. Meanwhile, someone else (Y) comes along to Cech cohomology and follows the link to the page with the accented character to create another version of the new page.

The purpose of naming conventions is to prevent duplicate pages. Of course, we can correct this when it happens, so it is not a disaster; but it is something to avoid if possible. The only convention that is easy for everybody to follow is the one without accents.

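Notice that this is not only about eastern European accented characters. There are many examples:

- ‘Eduard Čech’ → Eduard Cech
- ‘Élie Cartan’ → Elie Cartan
- ‘Владимир Воеводский’ → Vladimir Voevodsky
- ‘∞-category’ → infinity-category
- ‘Chern–Simons theory’ (with en dash) → Chern-Simons theory (with hyphen)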

All these, in addition to the lowercase singular American English rule.

It is also possible to diminish the influence of the unaccented page name by putting an accented title up top, just before the table of contents (if there is one); use # for this title, and use ## for the largest other headers (etc) to make the title appear in the text but not in the TOC. I have just done this for all of the ‘Cech’ pages.

• CommentRowNumber7.
• CommentAuthorTobyBartels
• CommentTimeMar 9th 2010

I'm not quite as inflexible as my previous comment makes me sound, but I wanted to lay out the full case for unaccented page names. In summary, we pick a convention to avoid creating duplicate pages, and we pick the one which is easiest for everybody to follow; we can pretty up the page afterwards.

(If you saw my previous comment while I was editing it, I'm now done editing it.)

• CommentRowNumber8.
• CommentAuthorTobyBartels
• CommentTimeMar 14th 2010

Mike brainstormed

Just brainstorming... what would people think about enforcing the ascii naming convention only for nonexistent pages? Since existent pages can be redirected, it doesn't seem as though having the "correct" name for the actual page is as bad there. And once a page is created, it can then be renamed by someone who can type unicode and have the ascii title redirected to it (or if the person creating it can type the correct characters, then they can just create it with the right name in the first place, and with a redirect for the ascii name). Kind of like how we can use links like [[categories]] for existing pages that have redirects, but for nonexistent pages we are supposed to write [[category|categories]] or [[group]]s so that if someone creates the page from that link, it will have the correct name. Only not quite the same. But maybe that would be too complicated a convention for people to remember?

If you want something analogous to writing [[category|categories]] and [[group]]s originally but [[categories]] and [[groups]] later, then we already have that. You have to write Cech groupoid now (since the page doesn't exist yet) but can write Čech cohomology (which won't show up right on this Forum, but you know what I mean) since the page already exists (and has the proper redirects put into it).

The first half of your brainstorm is different. I've thought about it, and it works as far as the pages in question are concerned. The downside to it is second-order: since people learn to use wikis by copying what's already there, having pages at Čech cohomology or categories directly (instead of at Cech cohomology or category, with redirects) encourages them to make new pages the same way. Of course, we already risk this by having people link directly to Čech cohomology or categories after the pages are created and have the redirects in them. The question is how to balance convenience, good looks, and encouraging good practice.

• CommentRowNumber9.
• CommentAuthorMike Shulman
• CommentTimeMar 15th 2010

Yes, I see your point. (Although of course it's not exactly the creating of new pages that's the problem, but the creating of links to pages that don't yet exist.) I don't know what the right balance is.

Thinking about it more, the copy-paste solution might not be so bad for when I'm at an unfamiliar computer. Or at least, with the special characters page, the change we're considering here wouldn't make working at such a computer noticeably more inconvenient than it already is (e.g. it's already inconvenient to use special characters in (redirected) links to existing pages, which I probably do a lot more frequently than I create links to nonexistent pages containing special characters).

Can we take a survey of all nLab authors? How easy is it for you to type special characters like $\infty$ and the C in Cech?

Can we also have a page of instructions on how to set up various operating systems to make it easy to type such special characters? I can contribute instructions for Emacs (yes, of course that's an operating system!).

• CommentRowNumber10.
• CommentAuthorEric
• CommentTimeMar 15th 2010

One thing to be aware of when using redirects (which I hope gets fixed one day) is that you lose the "Linked from:" list at the bottom of the page. I think this list is helpful, so it is a shame that redirects do not (yet) keep track of it.

• CommentRowNumber11.
• CommentAuthorTobyBartels
• CommentTimeMar 15th 2010
• (edited Mar 15th 2010)

@ Mike

(Although of course it's not exactly the creating of new pages that's the problem, but the creating of links to pages that don't yet exist.)

Can we take a survey of all nLab authors? How easy is it for you to type special characters

Most of the time it's pretty easy when I'm on my own computer, using SCIM if not the compose key. (I keep meaning to figure out exactly what I installed to make SCIM work so well for me, Mike. Since we both use Ubuntu, I could just give you a list of all of my installed packages, but that seems like overkill.) On another computer, I use the Unicode charts to find the &#x...; notation, which is not very convenient.
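
For readers who want to skip the chart lookup, the hexadecimal character reference for any character can be computed from its code point. The following is only a minimal sketch using the Python standard library; the function name `to_hex_reference` is made up for illustration and is not part of any nLab or nForum tooling.

```python
import html

def to_hex_reference(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal character reference."""
    # Assumption: ASCII characters are left alone, everything else is escaped.
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)

encoded = to_hex_reference("Čech cohomology")
print(encoded)                 # &#x10C;ech cohomology
print(html.unescape(encoded))  # decodes the reference back to the accented name
```

The round trip through `html.unescape` is a quick way to check that a reference found in a chart really denotes the intended character.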

• CommentRowNumber12.
• CommentAuthorAndrew Stacey
• CommentTimeMar 15th 2010

One word of warning. I just tried this out on doriath and found that in page names, there is a difference between the unicode character itself and its entity description. Take a look at the examples on HomePage (doriath).

I wondered about whether or not something like detexify would be nice for this. According to that person's notes, someone's had a go at adding unicode to detexify (presumably so that it tells you the unicode character as well). So you could scribble your attempt at a character and detexify would tell you what the nearest unicode was.

(maybe if we offer hosting at mathforge for detexify as an incentive ...)

• CommentRowNumber13.
• CommentAuthorAndrew Stacey
• CommentTimeMar 15th 2010

Slow afternoon ...

http://www.math.ntnu.no/~stacey/code/latexToUTF/utf.php

Currently accepts: single latex commands, either of the type that produces an accent (like \'e) or that produces a symbol (doesn't distinguish between mathematics and text modes). Accents work as \v{C} or without the curly braces if the command is punctuation (so \^c works). The results are produced as named entities which your browser should convert to unicode for cut-and-pasting. It shouldn't be hard to make it produce unicode natively if desired.
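
The idea behind such a converter can be sketched in a few lines. This is not the code behind the page linked above (which is not shown in this thread); it is an independent illustration that handles a small, hand-picked set of accent commands and, unlike the converter described, emits Unicode characters directly rather than named entities. The table `COMBINING` and the function `latex_to_unicode` are hypothetical names.

```python
import re
import unicodedata

# LaTeX accent command -> Unicode combining character (a small, illustrative subset).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "v": "\u030C",  # caron (hacek)
    "u": "\u0306",  # breve
    "c": "\u0327",  # cedilla
}

# Punctuation accents may omit the braces (\'e, \^c); letter accents need them (\v{C}).
ACCENT = re.compile(r"""\\(?:(['`^"~])\{?([A-Za-z])\}?|([vuc])\{([A-Za-z])\})""")

def latex_to_unicode(text: str) -> str:
    """Replace supported LaTeX accent commands with precomposed Unicode letters."""
    def replace(match: re.Match) -> str:
        accent = match.group(1) or match.group(3)
        letter = match.group(2) or match.group(4)
        # NFC composes base letter + combining mark into a single character where one exists.
        return unicodedata.normalize("NFC", letter + COMBINING[accent])
    return ACCENT.sub(replace, text)

print(latex_to_unicode(r"\v{C}ech, Fr\'echet, Fr\"olicher"))
```

Composing via combining characters avoids maintaining a table of every accented letter; the cost is that combinations with no precomposed form stay as two code points.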

• CommentRowNumber14.
• CommentAuthorAndrew Stacey
• CommentTimeMar 15th 2010

Re: two comments above. The examples for my warning actually lie in the Sandbox (doriath) not on the homepage.

• CommentRowNumber15.
• CommentAuthorTobyBartels
• CommentTimeMar 15th 2010

@ Andrew #12

Right, you're supposed to copy and paste the symbol itself. So for your examples involving ‘‱’ on Sandbox (doriath), the latter example is what one would use.

Although if one is creating redirects, then one could do the former as well; why not? I've created some redirects with ‘?’ in the name to handle links from the Forum too.

• CommentRowNumber16.
• CommentAuthorMike Shulman
• CommentTimeMar 15th 2010

Andrew, that converter is neat! I think that'll usually be even more convenient than searching through a copy-and-paste page.

Toby, I think I have SCIM installed at least -- it seems to be standard on Ubuntu nowadays. And in my System > Preferences there is something called "SCIM Input Method Setup." But I haven't found any documentation on how to use it, and it doesn't seem to work. Nothing I do in that setup application appears to have any effect on the rest of my computer.

• CommentRowNumber17.
• CommentAuthorAndrew Stacey
• CommentTimeMar 16th 2010

Added an ajax-facility to the converter so that clicking on "ajax" appends the new character to the currently displayed list. That way, one can build up a character table for one's own use.

(I got annoyed having to keep redoing it when switching from Fréchet to Frölicher and back again)

• CommentRowNumber18.
• CommentAuthorAndrew Stacey
• CommentTimeMar 16th 2010

Incidentally, what's going wrong with the links from the Forum? I can enter unicode fine (just did) so is it something to do with the links themselves?

• CommentRowNumber19.
• CommentAuthorTobyBartels
• CommentTimeMar 17th 2010

You can enter Latin-1 characters like ‘é’ and ‘ö’ just fine (and some others, like the quotation marks that I used around them), but not most other characters, such as ‘?’ and ‘?’. (They show up in preview but become question marks when posted.) However, you can also enter things as SGML numerical character entities, and that works.
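
The behaviour described here can be imitated offline. The sketch below assumes, purely for illustration, that the forum stores text as strict Latin-1; the thread does not establish the actual encoding, and `forum_safe` is a hypothetical helper. Characters that fit are kept, and everything else becomes a decimal numerical character reference.

```python
def forum_safe(text: str) -> str:
    """Keep Latin-1 characters; turn everything else into a decimal character reference."""
    # 'xmlcharrefreplace' is a standard codec error handler that emits &#NNN; references.
    return text.encode("latin-1", errors="xmlcharrefreplace").decode("latin-1")

print(forum_safe("Fréchet, Frölicher"))  # unchanged: é and ö are in Latin-1
print(forum_safe("Čech"))                # &#268;ech
```

Under strict Latin-1 the curly quotation marks would be escaped as well, so the forum's real encoding is presumably a slightly larger character set.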

• CommentRowNumber20.
• CommentAuthorHarry Gindi
• CommentTimeMar 17th 2010

Linux users have the compose key, and I found a cool little application for windows that enables the compose key called allchars. ßéè?

• CommentRowNumber21.
• CommentAuthorMike Shulman
• CommentTimeMar 17th 2010

Allchars looks nice for accents, but it doesn't seem to do $\infty$, does it?

• CommentRowNumber22.
• CommentAuthorHarry Gindi
• CommentTimeMar 17th 2010
• (edited Mar 17th 2010)

compose + oo = ?, which for some reason fails to display on the nForum but works otherwise.

Notice that it works on meta.mo. I've added it to the end of that post.

I've also added it to the top of the sandbox.

Something must be wrong with the nForum (maybe with the encoding settings?).

• CommentRowNumber23.
• CommentAuthorAndrew Stacey
• CommentTimeMar 17th 2010

Something does seem a bit funny with the encoding settings here, but I tried a test at the test site (nForumMathML) and it seemed to work (please, someone try this) so I'm not too bothered about figuring out exactly what's going wrong here.

• CommentRowNumber24.
• CommentAuthorAndrew Stacey
• CommentTimeMar 17th 2010

Back to the main issue, it seems that no-one objects to Zoran's basic point: that page names should reflect the proper name and redirects should be used to ensure that asciified versions also work, at least in so far as the proper name is readable by everyone (thus I think that Voevodsky's page should be "Voevodsky"). Certainly, pages with the correct accents and so forth should not be renamed to pages without.

There will always be the problem of naming pages that don't yet exist (particularly as one might not know that they don't yet exist when writing the page), and there will always be problems with people not being able to enter strange characters (or not knowing that the proper name has those strange characters!). And since we can rename and redirect pages, although I would like to encourage people to get it right first time, we shouldn't be too hard on people for getting it wrong.

That's my thoughts.

• CommentRowNumber25.
• CommentAuthorMike Shulman
• CommentTimeMar 17th 2010

Everyone is saying that linux has a Compose key, but I use linux and I don't have a Compose key. How do I get one?

• CommentRowNumber26.
• CommentAuthorMike Shulman
• CommentTimeMar 17th 2010

Ok, I figured it out; I need to xmodmap something to a Multi_key. But it doesn't look like I can get $\infty$ that way, according to these lists.

We could encourage people to create stubs (with redirects) for pages with non-ascii names, rather than leaving the links hanging. That would reduce the danger of duplicate pages being created from different links.

• CommentRowNumber27.
• CommentAuthorMike Shulman
• CommentTimeMar 17th 2010

Strangely, to get the Č in Čech with the compose key, I have to type Compose+lessthan+C, rather than the more intuitive (to me) Compose+v+C.

• CommentRowNumber28.
• CommentAuthorHarry Gindi
• CommentTimeMar 17th 2010

You can set up the compose map yourself in Linux.
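
As a rough illustration of what such a custom map looks like, the sketch below prints lines in the style of a per-user X11 compose file (commonly `~/.XCompose`). The key sequences chosen are just the ones mentioned in this thread; whether a given desktop or input method honours the file is an assumption that varies between systems, and `compose_entry` is a hypothetical helper.

```python
import unicodedata

def compose_entry(keys: list[str], char: str) -> str:
    """Format one compose-file line; `keys` are X keysym names typed after Compose."""
    sequence = " ".join(f"<{key}>" for key in ["Multi_key", *keys])
    # The trailing comment records the Unicode name for human readers.
    return f'{sequence} : "{char}" U{ord(char):04X} # {unicodedata.name(char)}'

print(compose_entry(["v", "C"], "Č"))
print(compose_entry(["o", "o"], "∞"))
```

Pasting the printed lines into the compose file (after an `include "%L"` line that keeps the system defaults) would, on systems that read it, make Compose+v+C produce Č.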

• CommentRowNumber29.
• CommentAuthorTobyBartels
• CommentTimeMar 17th 2010

@ Mike #27

And I have to type Compose-c-C; neither of the others works. (‘c’ for ‘check’, I guess; I also have ‘b’ for ‘breve’.) Clearly there are different systems. (I guess that you linked to two different systems in #26, so that explains it.)

@ Andrew #24

I objected in #6, and I'd at least like to know where to draw the line between ‘Čech’ and ‘Воеводский’. (Whether the alphabetic characters come from one of Unicode's Latin code blocks, I guess.)

Do I have it right that you and Zoran are proposing a change in naming conventions that affects the first two of the five examples in my comment #6, but not the last three?
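
The "Latin code block" criterion suggested in this comment can be approximated mechanically. The Python standard library does not expose Unicode block names, so the sketch below uses character names as a proxy: a title passes if every alphabetic character has a name beginning with LATIN. The function name is hypothetical, and this is only one possible reading of where the line might be drawn.

```python
import unicodedata

def is_latin_title(title: str) -> bool:
    """True if every alphabetic character has a Unicode name starting with 'LATIN'."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in title
        if ch.isalpha()
    )

for title in ["Eduard Čech", "Élie Cartan", "Владимир Воеводский", "∞-category"]:
    print(title, is_latin_title(title))
```

Note that the test says nothing about symbols: ‘∞’ is not alphabetic, so ‘∞-category’ passes, and the question of symbols and dashes has to be settled separately.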

• CommentRowNumber30.
• CommentAuthorAndrew Stacey
• CommentTimeMar 18th 2010

I would draw the line at the following point: can I, as a non-linguist, read the text?

I think that your point about non-existent pages is a good one. The minor issue I have with it is that when typing a page, I may not know if a page exists or not so will just put the link in in whatever form I think appropriate and may not go back and change them afterwards.

However, I felt that Zoran's issue was with existing pages, and in particular with renaming pages from accented to non-accented. Do you have any objection to an existing page having accents or symbols in the title?

One issue that I've just thought of is whether or not redirects come up in searches. Namely, if I search for 'Cech cohomology', I don't get the page 'Čech cohomology' as a page (I get it as one of the pages containing this phrase).

(And I apologise for saying "no-one objects" when you had)

• CommentRowNumber31.
• CommentAuthorUrs
• CommentTimeMar 18th 2010
• (edited Mar 18th 2010)

One issue that I've just thought of is whether or not redirects come up in searches.

That seems to be an important point.

Our naming convention for instance for "infinity" in titles looks a bit awkward in the title, but has the huge advantage that one can google for "infinity category" and get the nLab results.

That's hugely useful, as anyone who ever tried to google for instance for material on $C^*$-algebra theory and things like that will know: the literature on $C^*$-algebras is effectively hidden from Google.

• CommentRowNumber32.
• CommentAuthorHarry Gindi
• CommentTimeMar 18th 2010

We should put together a petition to google to get them to implement some sort of math-friendly search.

• CommentRowNumber33.
• CommentAuthorTobyBartels
• CommentTimeMar 19th 2010

@ Andrew #30:

I would draw the line at the following point: can I, as a non-linguist, read the text?

That's a vague line, unless by ‘I’ you mean specifically Andrew Stacey, in which case we'll have to ask you. After all, I am not a linguist, but only an amateur lover of alphabets. But I can read ‘Владимир Воеводский’ just fine; it takes me about twice as long as it takes me to read ‘Vladimir Voevodsky’, even though I'm not fluent in any language that uses Cyrillic script. Zoran, of course, is fluent in some such languages and can also read ‘Владимир Воеводский’ just fine, I'd expect with no loss of speed. Perhaps you mean an average native speaker of English (which is the language of the Lab)? OK, but where is that line?

I notice that you say below ‘accents or symbols’ (emphasis added). So to clarify: You actually want to change the naming convention for all but the middle example in my comment above; is that right?

Do you have any objection to an existing page having accents or symbols in the title?

Yes, the objection is this one: It encourages people to make links to new pages with accents or symbols in the title, even though others will find this inconvenient and not do so, causing the risk of two pages being created, which is what naming conventions are meant to avoid. (Well, they also avoid arguments over page names, but we're such a small group that we just shift those to arguments over naming conventions (^_^), so that makes no difference.)

The work-around is to put the nice title at the top of the page, perhaps as the header of the table of contents if there is one. People have also suggested a [[!title …]] command (analogous to [[!redirects …]] and [[!includes …]]), but that doesn't in fact exist now.

I know that this is a convoluted reason. Accordingly, it is a mild objection. (I haven't tried to change back the [[Čech …]] pages.) If I am alone here (as I seem to be now for accents, but perhaps not for symbols yet), then I agree that my objection loses.

(And I apologise for saying "no-one objects" when you had)

Well, I figured that you had just not noticed properly. (^_^)

• CommentRowNumber34.
• CommentAuthorTobyBartels
• CommentTimeMar 19th 2010

@ Harry #32:

I'm honestly surprised that Google doesn't at least index ‘ω’ (when it appears outside of a larger word) as ‘omega’ (at least in English-language pages). They recognise plenty of other grammatical equivalences, at least as complicated as that one.

• CommentRowNumber35.
• CommentAuthorHarry Gindi
• CommentTimeMar 19th 2010

Google doesn't even respect any unicode symbols even in parentheses. It's completely inadequate for anything technical.

• CommentRowNumber36.
• CommentAuthorAndrew Stacey
• CommentTimeMay 3rd 2010

We never really got this resolved, after Toby completely destroyed my lame attempt at a “third way”!

Someone proposed the addition of a [[!title: ]] tag. At first, I had some sympathy with the idea since I feel that having one thing do two jobs inevitably leads to the sort of problem we’re encountering here. However, I’m now thinking that the page name should be viewed as a [[!title: ]] tag and that encoding issues can be circumvented by appropriate use of redirects. To do this properly, the possible redirects for a page should be visible (somewhat similarly to the “included from” list) but that’s a much simpler feature-request than a whole new tag. So, my idea can be summed up as:

1. The canonical URL, aka “page name”, should be the ideal title for the page. There will, inevitably, be disputes about this and I think that the naming conventions should be viewed as guidelines for this - and as a way of resolving deadlocks - but I would also guess that the majority of pages can be named without dispute.
2. There should always be an “asciified” version in the redirects.
3. For non-latin alphabets, there should be a prominent header with the nearest latin equivalent. (This, again, is a bit subjective on when this would be required, but I would hope that if someone felt it was needed then no-one would object to its being there.)

This doesn’t address Toby’s argument about creating pages with or without non-ascii characters in names. The problem is that I don’t see a way to solve this issue, because it might not be the original author who puts the wikilinks in. Someone could write something in unicode, then someone else comes along later and thinks it should be a wikilink so puts square brackets around it. So unless we’re willing to give up unicode altogether, there’s always going to be a possibility of unicode titles.

(Yes, I know that I’m using the word “unicode” to mean “non-ascii” here)

I think that the real solution is that when someone creates a page then they should take a look at the list of “wanted pages” to see if anything there should redirect to the created page, and to see if any of the links listed there would make a better page title. This is not a technical solution, which makes it a little less clean! Perhaps we could link to the list of “wanted pages” from the edit page.

If I may address the original complaint, I’d also like to hope that those who do know what the right page title should be (accents and all) will not get too annoyed with those of us who don’t have a clue. I suspect that there are many names that ought to have accents that I write without them without knowing any better, and also I suspect that there are some names that I write with accents in the wrong places. I’m happy to admit my failings, but unless they are pointed out to me then I won’t know that they are failings to be admitted.

Regarding searches, this is a technology problem and so warrants a technology solution. If redirects aren’t indexed, then they should be, and if redirects aren’t searched internally, then they should be. These can be feature-requests on redirects. Especially given that redirects are still a fairly new feature, I think that Jacques would be particularly open to hearing our thoughts on how they can be used.
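
To make the feature request concrete, here is a toy model of a title search that consults redirects and ignores accents. The page data, `fold`, and `search` are all invented for illustration; nothing here reflects how the wiki software actually stores pages or runs searches.

```python
import unicodedata

def fold(text: str) -> str:
    """Lowercase and strip combining accents so 'Čech' and 'Cech' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).casefold()

# Hypothetical wiki data: canonical page name -> list of redirect names.
PAGES = {
    "Čech cohomology": ["Cech cohomology"],
    "Fréchet space": [],
    "infinity-category": ["∞-category"],
}

def search(query: str) -> list[str]:
    """Return canonical page names whose name or redirects match the query."""
    needle = fold(query)
    return [
        name
        for name, redirects in PAGES.items()
        if any(needle in fold(candidate) for candidate in [name, *redirects])
    ]

print(search("Cech cohomology"))
print(search("∞-category"))
```

Accent folding covers the Čech case even without a redirect, while the ‘∞’ query only succeeds because the redirect list is searched too, which is the behaviour being asked for.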

• CommentRowNumber37.
• CommentAuthorTobyBartels
• CommentTimeMay 3rd 2010

Andrew, I still don’t understand what you want, and I don’t understand what is different now from before. I come to clarify your idea, not to destroy it.

I would find it very helpful if you would state, for each of the five examples listed in #6, whether you advocate using the ‘proper’ orthography on the left or the ‘ASCIIfied’ orthography on the right. I guessed in #33 above that you would ASCIIfy only example (3), but you never confirmed that (or my previous guess in #29). Since the adjustment of Unicode in the Forum has mangled some of those examples, I copy them below (and number them for convenient reference):

1. ‘Eduard Čech’ → Eduard Cech
2. ‘Élie Cartan’ → Elie Cartan
3. ‘Владимир Воеводский’ → Vladimir Voevodsky
4. ‘∞-category’ → infinity-category
5. ‘Chern–Simons theory’ (with en dash) → Chern-Simons theory (with hyphen)

(The English Wikipedia, if anybody cares, would ASCIIfy examples 3&4.)

• CommentRowNumber38.
• CommentAuthorMike Shulman
• CommentTimeMay 3rd 2010

I definitely feel that example (3) should be “asciified,” since the nlab is written in English, and part of writing in English is using the latin alphabet. When English borrows words and names from other languages which also use the latin alphabet but with diacritics, it often keeps the diacritics, even though in “pure” English diacritics are not used, so I can see the argument for keeping those. At least, they don’t impede an English speaker from trying to pronounce the word, and they might help. But I would have no chance of pronouncing “Владимир Воеводский” if I didn’t already know what it said.

• CommentRowNumber39.
• CommentAuthorTobyBartels
• CommentTimeMay 3rd 2010

Andrew wrote:

I’m now thinking that the page name should be viewed as a [[!title: ]] tag

I use the h1 (single #) header at the top of the contents in place of a [[!title:…]] tag, where I violate the singular uncapitalised American ASCII convention in all four ways. But neither Urs’s nor Mike’s style of ToC allows this.

Someone could write something in [non-ascii], then someone else comes along later and thinks it should be a wikilink so puts square brackets around it.

Good point. I am slowly moving towards the view that the naming conventions are not really important, since people are good about adding lots of redirects now. But I still want somebody to stick up for the view that they do matter, which I did in #6.

I think that the real solution is that when someone creates a page then they should take a look at the list of “wanted pages” to see if anything there should redirect to the created page, and to see if any of the links listed there would make a better page title.

That list is huge; one has to load it in the browser, then search within the page (usually Ctrl-F) for keywords. At least with the current technology, we can’t really expect people to do that. (And I don’t do it myself; I just put in the redirects that I think are likely to come up.)

I’d also like to hope that those who do know what the right page title should be (accents and all) will not get too annoyed with those of us who don’t have a clue.

I would just silently move the page, just as I already edit the text.