Acting on a hunch, I deleted the cached version of the “all pages” and then loaded it whilst watching the memory usage. It zooms up to 300MB (normal operating memory for instiki is under 100MB) and stays there. That it stays there is hopefully fixable, but that it zooms up is probably a side-effect of the size of the nLab: anything that involves effectively querying every single page is going to take time and memory to do.
However, if we can figure out what useful information is contained on that page, there may be other ways of generating it which don’t then tie up the instiki processes for everyone else. In particular, since I get a copy of the data each day, I can run a script on that to produce daily updates of whatever information people use the “All pages” for.
So, some questions:
What information are people looking for on the “All pages” list (and, I guess, the “Recently revised” list)?
How “immediate” would they realistically like that information to be?
I never use “All pages”. If I want to know which entries exist, I use Google.
I do use “Recently Revised” regularly, though, to check whether there are edits by people who don’t report them on the Forum.
On that latter point, my guess would be that if the list was regenerated daily then that would suffice, is that correct?
is that correct?
Yes!
I used to check All Pages for pages (or links to pages, on the right-hand side) whose names violated the naming conventions. But even sticking to names that begin with a capital letter, it’s too long for that now. I still occasionally use it to check for orphaned pages, however.
I sometimes check Recently Revised, when I know that somebody else has been editing the same pages as I’ve been (or been interested in), and I want to see at a glance if anything new has been done. For this, I would like it to be complete up to the minute.
I use these pages within specific categories much more often, to see everything in the category, but since results are much faster then, I assume that this is not a strain.
I seem to be getting a reputation for Spam detection. :-) My method is to check the pages named A. C. <-guess who I mean. I use recently revised for this, but once a day would do.
Recently Revised is useful when someone says they are going to fix or revise a page to check if something has been done yet.
For this purpose it might be useful to break down the list into say 3 different ones:
Recently revised in last week.
Recently revised in last month.
Revised since the beginning of time.
The first two should be up to the minute current, while the last honking big one could be generated on a daily (or weekly) basis - with a note saying to check the other two lists for the latest results.
Some comments on my usage, which may include observations not made above. I find All pages useful within my personal web, as I forget the names of non-concept pages, e.g. pages related to a specific task, such as one named after the title of a lecture of mine (and Google would not help me here). On the main web I more often use the Recently revised page. How far back one wants Recently revised to go is relative: in a personal web, if one has been away from working on it for 3 months, then the most recent entry is 3 months old. So if filtering by age, I would measure it in hundreds of items rather than in absolute time. Actually, the way Recently revised works now is pretty good for most purposes. With All pages, on the other hand, one usually wants the existing pages, and not so often the links to the wanted pages.
"All Pages" would also list all the wanted pages, the absent targets of wiki links, which can be handy if you want to see where there are important gaps lingering; there should be (probably is...) some way to list these in page-sized chunks, I think?
Why should loading either one require querying the entire database at once? Wouldn’t it make more sense to store the current version of each somewhere and update it incrementally as other pages are changed? Obviously that isn’t something that will be changed soon, but I’m just curious.
One quick note in reply to Zoran: this is specific to the nLab and not the personal webs. None of those is anywhere near the size of the nLab, so they don’t (as far as I’m aware) exhibit the same memory-eating behaviour.
Mike, I think that that would only work for “Recently Revised”, and even then it wouldn’t work with RR as it currently is since each page only appears once, so when a page gets updated it has to be removed from further down the list. However, that’s effectively what I’m proposing.
Toby, I think that it’s only the Big Lists that cause the problem, the category-specific ones are fine.
Assuming that no one weighs in with a strong reason for up-to-date listings, the easiest is to generate these pages from the bzr repository since that would mean that it didn’t involve the database at all and so was working completely independently of the actual nLab. That would mean that the resolution would be “per day”.
Assuming that no one weighs in with a strong reason for up-to-date listings,
Several people have weighed in on reasons for up-to-the-minute listings for Recently Revised. All of these could be satisfied with an n-RecentlyRevised or t-RecentlyRevised list which is limited to the most recent n page revisions or to the page revisions over the preceding time period t.
The values for n or t could come from some fixed sets, or a Short Recently Revised page could have these values settable in a box at the top if a default like 1 week is not what is wanted.
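To make the n/t idea concrete, here is a minimal sketch in Python. It is not how instiki does it; the function name `recently_revised`, its parameters, and the `(page_name, revised_at)` input format are all made up for illustration. It keeps one entry per page (as Recently Revised does now) and then truncates by count, by age, or both.

```python
from datetime import datetime, timedelta

def recently_revised(revisions, n=None, within=None, now=None):
    """Return (page, revised_at) pairs, newest first, one entry per page.

    revisions: iterable of (page_name, revised_at) tuples, in any order
               (a hypothetical input format, not instiki's schema).
    n:         keep at most the n most recently revised pages.
    within:    keep only pages revised within this timedelta of `now`.
    """
    now = now or datetime.now()
    latest = {}
    for page, revised_at in revisions:
        # Each page appears once, under its most recent revision.
        if page not in latest or revised_at > latest[page]:
            latest[page] = revised_at
    entries = sorted(latest.items(), key=lambda item: item[1], reverse=True)
    if within is not None:
        cutoff = now - within
        entries = [(page, t) for page, t in entries if t >= cutoff]
    if n is not None:
        entries = entries[:n]
    return entries
```

With a default such as `within=timedelta(weeks=1)` this would give the "last week" list, and `n=100` would give a list measured in items rather than in absolute time.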
Not quite. No-one’s said that they need to have absolutely latest-minute information about who edited what.
(I wouldn’t be averse to making a feature request to make “Recently Revised” really mean “Recently”, say within the last week or month. But right now, I’m talking about working with the software as-is rather than with regard to any desired feature requests.)
I wouldn’t be averse to making a feature request to make “Recently Revised” really mean “Recently”
Again, hopefully this would not apply to personal webs, where one month is too short a time to stop at.
I think that that would only work for “Recently Revised”, and even then it wouldn’t work with RR as it currently is since each page only appears once, so when a page gets updated it has to be removed from further down the list.
I don’t understand. Couldn’t removing the page from further down the list also be done incrementally whenever the page is changed? And why wouldn’t it work for “All Pages”? Doesn’t that list only need to get updated when a new page is created? (Well, I guess the list of wanted-links needs to be updated any time a page is modified, but couldn’t that be done at the time each page is modified?)
Also, do RSS feeds perform the same intensive query that RR does? Would the same fix be applied to them?
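For what it is worth, the incremental bookkeeping asked about above might be sketched like this in Python. This is only an illustration of the idea, not a description of instiki's internals: the class `PageIndex` and its methods are hypothetical names, and the link targets are assumed to be handed over by whatever saves the page.

```python
from collections import OrderedDict

class PageIndex:
    """Hypothetical in-memory index, updated once per page save."""

    def __init__(self):
        self.recent = OrderedDict()  # page -> last editor, most recent last
        self.links = {}              # page -> set of link targets on that page

    def record_edit(self, page, editor, link_targets):
        # Re-inserting and moving to the end removes the page from
        # "further down the list", so each page still appears only once.
        self.recent[page] = editor
        self.recent.move_to_end(page)
        # Replace the stored links, so targets that were corrected or
        # removed in this edit stop being counted.
        self.links[page] = set(link_targets)

    def recently_revised(self, limit=None):
        pages = list(reversed(self.recent))
        return pages if limit is None else pages[:limit]

    def all_pages(self):
        return sorted(self.links)

    def wanted_pages(self):
        # Computed from the stored link sets; no page content is re-read.
        existing = set(self.links)
        wanted = set()
        for targets in self.links.values():
            wanted |= targets - existing
        return sorted(wanted)
```

The cost per save is then proportional to the size of the edited page, not to the size of the whole wiki; the lists themselves are only read, never rebuilt from scratch.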
No-one’s said that they need to have absolutely latest-minute information about who edited what.
That’s a strong word. I don’t need to have any Recently Revised or All Pages whatsoever, including categories and personal webs. I doubt that anybody else needs them either.
However, I would like to have, on demand, absolutely latest-minute (latest-second is not so important) information about who edited what when. I don’t use this much, but sometimes I do.
All Pages is a much bigger hog than Recently Revised, isn’t it? (Certainly the time delay is bigger.)
Zoran: Absolutely, this is just about the nLab.
Toby: True, “need” is stronger than I meant.
I use Recently revised, but not All pages.
When I was actively working on the nLab, I would literally check Recently Revised every 10-15 minutes and was frustrated with how slowly the RSS updated. Although I’m not doing that anymore, it is easy to imagine an eager new Lab Elf.
All Pages is a different story. Updates once every week or so would be fine.
Right, I’ve redirected “All Pages” to a temporary notice as it’s been really slowing the lab down ridiculously. I’ll set up a list that is updated daily. The lists “by category” and lists for the other webs should be working just as usual.
I’ve redirected “All Pages” to a temporary notice as it’s been really slowing the lab down ridiculously.
Good, thanks. We should have thought of that earlier.
When calling zoranskoda:All Pages in my personal nLab, I notice that some of the items on the list of “wanted pages” are false alarms. For example, in the page funktor (zoranskoda) I have a link to Functors (joyalscatlab), on Joyal’s CatLab, which displays and links correctly. However, the allpages list shows “Functors” as a wanted item called from page funktor (zoranskoda). Similarly, at the page hom10connDmod (zoranskoda) I have the link Alexander Beilinson (nlab), which displays and links correctly to the main lab; however, allpages for the zoranskoda lab lists “Alexander Beilinson” as wanted at zoranskoda:hom10connDmod. These pages existed at the time when the links were made. Many other correctly working links to other labs do not appear on the allpages wanted list; only some do.
This looks like something that needs to be sorted out in instiki itself so should be reported to Jacques.
In fact, I think it is possible that when I first created the link, I did not write the redirect to the foreign lab properly and corrected this later; however, once a page is wanted, it stays in the database of wanted pages forever, even after the link is corrected to the proper name. This is my rough guess.
Okay, so maybe we should do a little experimentation to work out the conditions when it happens before reporting it.
@ Zoran #22:
That bug’s been around forever. I don’t think that I ever got around to pointing it out to Jacques.
As far as I’ve noticed, it happens whenever one web A links to another web B using a page name that doesn’t appear on A. First linking to the page on web A is definitely NOT a necessary condition.
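To pin down the condition for a bug report, here is a toy model in Python of the symptom as described above. It is purely a guess at the kind of lookup that would produce it, not a reading of instiki's code; the function `wanted_pages`, the `resolve_foreign` flag, and the data layout are all hypothetical.

```python
def wanted_pages(webs, web, resolve_foreign=True):
    """List (target_name, linking_page) pairs that look "wanted" on `web`.

    webs: dict mapping web name -> dict mapping page name -> list of
          (target_web or None, target_name) links; None means "same web".
    resolve_foreign: if False, mimic the suspected bug by looking up every
          target name on the linking web, even for cross-web links.
    """
    wanted = set()
    for page, links in webs[web].items():
        for target_web, target_name in links:
            if target_web is None or not resolve_foreign:
                lookup_web = web
            else:
                lookup_web = target_web
            if target_name not in webs.get(lookup_web, {}):
                wanted.add((target_name, page))
    return sorted(wanted)
```

In this model, a link from web A to a page on web B shows up as wanted exactly when no page of that name exists on A, which matches the condition described above; whether that is what actually happens inside instiki would still need checking.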