If I search for ’global sections’ in Google, then I get a hit which is a link http://ncatlab.org/nlab/new/global+section.
Has this been seen before? Anything we can do?
Poking around I can bring up the result http://ncatlab.org/nlab/show/global%20section. A bit odd..
That's dangerous; somebody could overwrite the article by accident. The robots meta tag on the new page clearly says not to index it, so this is a mistake on Google's part.
Google has clearly picked up the link from somewhere but I’m unclear as to whether or not it has followed it. Underneath the search hit I get:
A description for this result is not available because of this site’s robots.txt – learn more.
After clicking on “learn more” I found this paragraph:
While Google won’t crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.
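To make the distinction between crawling and indexing concrete, here is a minimal Python sketch using the standard library's robots.txt parser. The rules in `ROBOTS_TXT` are hypothetical (the nLab's actual robots.txt is not quoted in this thread), and `may_crawl` is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the real robots.txt is not shown here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /nlab/new/
Disallow: /nlab/edit/
"""


def may_crawl(url, user_agent="Googlebot"):
    """Return True if the (assumed) robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_crawl("http://ncatlab.org/nlab/new/global+section"))
    print(may_crawl("http://ncatlab.org/nlab/show/global+section"))
```

Under these assumed rules a crawler never fetches the new page, so it never sees that page's robots meta tag either. That would be consistent with Google listing the bare URL without a description.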
So it would appear that there is a link to that URL somewhere. Or was, as I can’t find it via the link:<url> method on Google.
I’ve seen results like that a few times in the last week. I think the best thing to do is to modify instiki so that /new/ redirects to /edit/ for pre-existing pages.
Yes, this would be a good change regardless of what Google does.
Jacques has implemented this: new/existing+page redirects to edit/existing+page.
Great! Now we just have to convince Google to link to /show/ instead of /new/ …
I’ve just had this reported to me by email. Even with the redirect, it’s a bit annoying as it still takes someone to the edit page instead of the show page.
One possibility would be to put in a redirect at the web server level so that if someone comes to the nlab from google then they always go to the show page and never to the edit or new page.
Is it just google that is doing this? I still don’t know where they are getting the links from. I guess that we have links of the form new/page+that+doesn't+exist+yet and it might be that google caches these links but doesn’t notice when they get fulfilled. Actually, that’s not a daft hypothesis. Since Google can’t look at the page (due to the robots.txt rules), all it knows about is the link. Maybe it doesn’t save the page it got the link from so never notices when the link is (effectively) removed.
I’m trialling the redirect I mention above. What ought to happen is that a page request for a /new/ on the nlab (not on any other webs) will get redirected to the /show/ directive. This might in turn get redirected back to /new/, but only if the page doesn’t exist. The redirect ought to apply only when the request was referred from a Google site.
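As a rough illustration of the rule being trialled, here is a Python sketch of a Referer-based redirect decision. It is not the actual web server configuration, which is not shown in this thread; the `/nlab/` path prefix, the host pattern, and the names `from_google` and `redirect_target` are all assumptions made for the example.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern: google.com, www.google.com.au, google.co.uk and similar hosts.
GOOGLE_HOST = re.compile(r"^(?:[a-z0-9-]+\.)*google\.[a-z]{2,3}(?:\.[a-z]{2})?$")

# Assumed URL prefix for the nlab web; other webs are left alone.
NEW_PREFIX = "/nlab/new/"
SHOW_PREFIX = "/nlab/show/"


def from_google(referer):
    """Return True if the Referer header value points at a Google host."""
    if not referer:
        # A missing or empty Referer cannot be matched at all.
        return False
    host = urlsplit(referer).hostname or ""
    return bool(GOOGLE_HOST.match(host))


def redirect_target(path, referer):
    """Return the path to redirect to, or None to serve the request as is."""
    if path.startswith(NEW_PREFIX) and from_google(referer):
        return SHOW_PREFIX + path[len(NEW_PREFIX):]
    return None


if __name__ == "__main__":
    referer = "https://www.google.com.au/search?q=global+sections"
    print(redirect_target("/nlab/new/global+section", referer))
    print(redirect_target("/nlab/new/global+section", None))
```

The pattern allows for country-specific Google domains. A request that arrives without a Referer header falls through untouched, which is the inherent weak point of testing on that header.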
When I go back up to David's original post, it's still sending me to edit and not to show. Did you make sure to get country-specific versions of Google? (David was using google.com.au.)
Actually, I'm still getting edit even when I come from google.com.
This search https://www.google.com.au/search?q=bill+lawvere gives me results (after wikipedia, Bill’s home page and a video):
in that order, with descriptions as shown.
Toby, what browser are you using?
It works for me on Mac OS X using Firefox, Safari, or Chrome. It doesn’t work for me on Safari on an iPad. The problem is that some browsers do not set the HTTP_REFERER field correctly and that’s what I’m using to test against.
David, there’s absolutely nothing I can do about the results that google returns. It’s what happens when someone clicks on one of those links that we’re working with.
The redirection from new to edit if a page exists is simple enough to change from new to show. Would this make a more sensible redirect? I would think so, since if you didn’t know that a page existed and clicked on new/existing+page then you really ought to look at the page properly before editing it.
Would this make a more sensible redirect?
Yes.
Sorry, bit of a stupid moment there. Was thinking you’d found some workaround to make the spiders ignore the links to the ’new’ page.
I think redirecting to ’show’ is the most sensible thing. It’s not exactly good UX for people to click on the first nlab link in Google and be taken to an edit page.
Okay, I’ve edited instiki so that the redirect goes to show instead of edit.
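For readers who want the resulting behaviour spelled out, here is a small Python sketch of the decision. It is not Instiki's own code; `handle_new` and the set of existing page names are hypothetical stand-ins for the wiki's page lookup.

```python
def handle_new(page_name, existing_pages):
    """Decide what a request for new/<page_name> should do.

    existing_pages is a hypothetical stand-in for the wiki's page lookup.
    """
    if page_name in existing_pages:
        # The page already exists: send the visitor to read it first.
        return ("redirect", f"show/{page_name}")
    # No such page yet: present the form for creating it.
    return ("render", f"new/{page_name}")


if __name__ == "__main__":
    pages = {"global+section"}
    print(handle_new("global+section", pages))
    print(handle_new("page+that+doesn't+exist+yet", pages))
```

Redirecting to show rather than edit means a stale new link cannot drop a visitor straight into an edit form for an existing article.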
Toby, what browser are you using?
I believe that I was using up-to-date Firefox on Ubuntu 12.04, but I suppose that it's moot now.