
• CommentRowNumber1.
• CommentAuthorAndrew Stacey
• CommentTimeJun 16th 2010
• (edited Jun 16th 2010)

In the discussion on Nishimura, Urs posted a link which got eaten up by the sanitiser. In the brief ensuing discussion, he seemed unsure of a way to reliably insert links both here and in the nLab. I’m none too sure of the details of what isn’t working, so if he or anyone else has examples of either syntax that ought to work but doesn’t (either here or in the nLab), or differences between the two places, please let me know. (I’ll be quite lax about how to interpret “ought to work”!)

My understanding was that the following were the basic rules:

1. If you post a link using actual XHTML syntax then it has to be completely valid. In particular, ampersands must be escaped (i.e. replaced by &amp;). If you don’t, then the nLab will complain about invalid XHTML, while here the sanitiser will eat the URL.
2. If you use Markdown syntax, you simply need to escape spaces and brackets (since both are special for Markdown and both mark the end of the URL). Markdown will handle the rest of the escaping.
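As a rough illustration of the two rules above, here is a minimal sketch. The function names are hypothetical (nothing like them is part of the forum or nLab software), the input is assumed to be a raw URL that has not already been escaped, and "brackets" is taken to mean round brackets.

```javascript
// Hypothetical helpers illustrating the two rules; not part of any real software.

// Rule 1: inside an XHTML attribute, the characters special to XML must be
// escaped. The ampersand goes first so the other entities are not re-escaped.
function escapeUrlForXhtml(url) {
  return url
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Rule 2: inside a Markdown link, spaces and round brackets end the URL,
// so percent-encode just those (assumption: nothing else needs touching).
function escapeUrlForMarkdown(url) {
  return url
    .replace(/ /g, "%20")
    .replace(/\(/g, "%28")
    .replace(/\)/g, "%29");
}

// Illustrative input only.
console.log(escapeUrlForXhtml("http://example.org/find?skip=0&query_id=abc"));
console.log(escapeUrlForMarkdown("http://example.org/show/(infinity,1)-category"));
```

Both helpers would double-escape a URL that has already been through them once, so they should only be applied to the raw form.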

I just tried it with what I think is the link that Urs tried, namely:

    http://arxiv.org/find/grp_math/1/au:+Nishimura/0/1/0/all/0/1?skip=0&query_id=2c78a1a2f35a7b36


At the end of the Sandbox is the following code:

    [arXiv on Nishimura](http://arxiv.org/find/grp_math/1/au:+Nishimura/0/1/0/all/0/1?skip=0&query_id=2c78a1a2f35a7b36)

    <a href="http://arxiv.org/find/grp_math/1/au:+Nishimura/0/1/0/all/0/1?skip=0&amp;query_id=2c78a1a2f35a7b36">arXiv on Nishimura</a>

    <a href="http://arxiv.org/find/grp_math/1/au:+Nishimura/0/1/0/all/0/1?skip=0&query_id=2c78a1a2f35a7b36">arXiv on Nishimura</a>


The first two work (note the escaped ampersand in the second), the third complains. If I do the same here, I get a complaint upon preview (about the third). Here are the first two:

arXiv on Nishimura

arXiv on Nishimura

So if anyone has examples of URLs that produce unexpected behaviour, please post them. The safest way to post them is to put them in code syntax: either with backticks, indented four spaces with blank lines fore and aft, or with the “fences” ~~~ on the previous and succeeding lines (see the source of this comment for examples). In particular, I don’t seem to be able to reproduce the behaviour whereby the validator actually eats the URL!

• CommentRowNumber2.
• CommentAuthorUrs
• CommentTimeJun 16th 2010

If you use Markdown syntax, you simply need to escape spaces and brackets

Ah, that’s probably the problem that I kept mentioning. I’ll let you know when I come across it next time.

• CommentRowNumber3.
• CommentAuthorAndrew Stacey
• CommentTimeJun 16th 2010

Thanks for the clarification.

Unfortunately, there is always going to be a need to escape something simply because at some point some program needs to know where the URL ends and the next chunk of stuff begins. So whatever delimits that ending will need to be escaped if it appears earlier. The problem is compounded by the fact that the URL that is displayed in the location bar at the top is not the URL that is in the page, or the URL that gets sent to the server. So when cutting-and-pasting, one has to take into account the different coding systems. I suppose it would be possible to write a javascript/cgi-script that took a url as input and output the encoded url, but given that brackets and spaces are comparatively rare in urls (compared to ampersands, at least), I would recommend the Markdown syntax as the safest one.

I’m still bamboozled by the vanishing url in your original comment, by the way.

• CommentRowNumber4.
• CommentAuthorUrs
• CommentTimeJun 16th 2010

but given that brackets and spaces are comparatively rare in urls

Well, in the URLs that I tend to link to a lot on the nLab, they are very common (all the $(\infty,n)$-business), which is probably the reason why I kept having these problems.

• CommentRowNumber5.
• CommentAuthorTim_van_Beek
• CommentTimeJun 16th 2010

I suppose it would be possible to write a javascript/cgi-script that took a url as input and output the encoded url…

Do you need a different encoding than the JavaScript escape function offers? See escaper.

• CommentRowNumber6.
• CommentAuthorAndrew Stacey
• CommentTimeJun 16th 2010

I think that would suffice. I just tried the following as “smart bookmarks”:

    javascript:alert(escape(document.getSelection()))


The first requires that you select the url from the page, but it’s unlikely to be visible. The second escapes too much as it also escapes the colon at the beginning of the href. But it gives me hope that it would be fairly simple to do.
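To make the colon issue concrete, here is a small sketch comparing `escape` with the standard `encodeURI`, which leaves the URL's own punctuation (colon, slashes, question mark, ampersand) alone. The helper name `encodeUrlForMarkdown` and the example URL are made up for illustration, and the input is assumed to be a raw, not yet percent-encoded, URL.

```javascript
// Hypothetical helper: encode a raw URL for use inside a Markdown link.
function encodeUrlForMarkdown(rawUrl) {
  // encodeURI encodes spaces but keeps ":", "/", "?", "&" and "=" intact.
  // It also leaves round brackets alone, so those are replaced by hand.
  return encodeURI(rawUrl)
    .replace(/\(/g, "%28")
    .replace(/\)/g, "%29");
}

const raw = "http://example.org/(a b)?x=1&y=2"; // illustrative URL only

// escape() also encodes the colon after "http", which breaks the link.
console.log(escape(raw));

// The sketch keeps the scheme and query punctuation as they are.
console.log(encodeUrlForMarkdown(raw));
```

Running an already-encoded URL through this a second time would turn each `%` into `%25`, so it is only safe on the raw form.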

• CommentRowNumber7.
• CommentAuthorAndrew Stacey
• CommentTimeJun 16th 2010

Hmm, I’m running into problems with double-escaping in URLs. So here’s a simple one for nLab pages:

    javascript:alert('http://ncatlab.org/nlab/show/'+escape(document.getSelection()))


The point is to select the link text somewhere in the page, then the above will convert it to the relevant escaped URL. In particular, if you are on the page in question then you can simply select the title. That doesn’t help with links elsewhere, though! Plus, for links to nLab pages either here or in the nLab itself, one ought to use wikilinks if possible.

Ah, but if you’re typing the link in to a text box somewhere then you can simply select what you’ve just typed and run the first bookmark from my previous comment on the selected text.

A more sophisticated version would actually replace the selected text by the escaped version.
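A minimal sketch of what that replacement step might look like, written as a plain function on strings so that it can be tried outside a browser. The names `replaceSelection` and `escapeTitle` are hypothetical, and `encodeURIComponent` is used in place of `escape` as an assumption about what one would want today.

```javascript
// Hypothetical helper: replace text[start, end) with transform(selection) and
// report where the new selection sits.
function replaceSelection(text, start, end, transform) {
  const selected = text.slice(start, end);
  const replacement = transform(selected);
  return {
    text: text.slice(0, start) + replacement + text.slice(end),
    selectionStart: start,
    selectionEnd: start + replacement.length,
  };
}

// Percent-encode a page title. encodeURIComponent leaves round brackets
// untouched, so they are replaced separately.
function escapeTitle(title) {
  return encodeURIComponent(title)
    .replace(/\(/g, "%28")
    .replace(/\)/g, "%29");
}

// Illustrative input: characters 4 to 7 ("a b") are the selected part.
const result = replaceSelection("see a b here", 4, 7, escapeTitle);
console.log(result.text);
```

In a browser, one would presumably feed it a text box's `value`, `selectionStart` and `selectionEnd`, and write `result.text` back to `value` afterwards.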

• CommentRowNumber8.
• CommentAuthorTim_van_Beek
• CommentTimeJun 16th 2010

If you really need something customized you will find tons of examples on the web; escaping is one of the basic gymnastics :-)

• CommentRowNumber9.
• CommentAuthorIan_Durham
• CommentTimeJun 18th 2010

Thanks for clarifying this. I had a problem recently with that, but switching from XHTML to Markdown fixed my problem, which means I probably had an ampersand in there somewhere (it was a very long URL).