I would like to keep a local copy of the source of all articles of nLab.
The reasons for this are (a) the nLab is often down; (b) even if the nLab is up, the server often takes a lot of time to respond; (c) once the article is retrieved, the math takes a lot of time to render, especially for long articles, and it looks ugly.
To mitigate these problems, I would like to keep my own local copy of the nLab, compiled using TeX into a DVI file, which can then be viewed locally without any of the above problems, with all the hypertext infrastructure (e.g., the hyperlinks to other nLab articles) available.
The TeX files produced by the nLab server seem to have some problems, e.g., the double-bracketed links to other nLab articles are not processed at all, which forces me to process the source files myself, either by hacking the Instiki source or by writing my own converter to TeX.
The above approach requires one to maintain a local copy of the source of all articles. I consulted the wiki (http://ncatlab.org/nlab/show/HowTo#download), and the currently recommended approach seems to be to generate the list of all articles and then download each one manually.
This is feasible for the initial download, even though the /nlab/source/ files serve the source text wrapped in HTML boilerplate (completely useless in my opinion), which must be stripped manually (unlike the TeX files, which are served as text/plain). It is totally inappropriate for keeping the database updated, however, because all pages (including the /nlab/source/ ones) are generated dynamically, which rules out HTTP mechanisms such as If-Modified-Since that would allow one to download only the modified pages. And even with such mechanisms, one would still have to issue an individual HTTP request for every single page on the nLab, which by itself takes a lot of time.
Much more efficient ways exist to synchronize directories containing many files. For example, rsync can compute a list of the modified parts of modified files and transfer just those in a few seconds. This also puts much less stress on the web server than the other methods, e.g., the one cited above.
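For instance, supposing an rsync daemon were exposed on the server (the module name nlab-src below is purely hypothetical), keeping a local mirror current would amount to a single command:

    # hypothetical rsync daemon module; only the changed parts of changed files are transferred
    rsync -az --delete rsync://ncatlab.org/nlab-src/ ~/nlab-src/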
Would it be too much to ask that rsync be enabled (e.g., via rsyncd) on the nLab web server (in read-only mode, of course)?
There is a bzr repository, containing the full sources and all revisions of all nLab pages, that can probably be used for this purpose. I will give you some instructions here later today or tomorrow.
Great, many thanks! Any decent RCS should have efficiency close to that of rsync, I guess.
It turns out there was an old page here with instructions on how to do this. I’ve updated it, let me know if you have any problems following the instructions. (The seed file is actually still uploading at the moment, but it should be available in a couple hours.)
Excellent, many thanks for your help!
Everything works fine, bzr pull was successful.
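For anyone else following along, the steps were essentially the standard bzr workflow (the URL below is just a placeholder for whatever the updated HowTo page lists):

    # one-time setup: create a local branch of the published repository (placeholder URL)
    bzr branch http://example.org/nlab-content nlab-content
    # thereafter, fetch new revisions incrementally
    cd nlab-content
    bzr pull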
It would generally be good if a few people keep full backups of the full nLab sources locally. As a safety measure. Should I disappear for good, I hope that after a while somebody with such a source will set up the nLab again.
Should I disappear for good
Do you mean this in the sense that you (like anyone else) might die on any given day, or do you mean that you are seriously considering disappearing from the nLab?
You’ve been missed recently.
Indeed, your message seems a bit worrisome. What is the current back-up policy? Is the nLab protected against a hard-disk crash?
I am not going to give up on the nLab voluntarily. But who knows what will happen. I suppose the server provider also provides backups, but what if we lose contact with them, either because they go out of business, or because I lose my life or my mind or, worse, my credit card.
As with DNA, the way to preserve volatile data through the millennia is to keep making distributed copies.
So, how big is it? It should be possible to find cloud back-up storage somewhere.
As a bzr repository the data is about 600 MB (about 500 MB archived).
full nLab sources
Does “full” mean that it also contains the historical revisions of the pages, or just the current version?
@12: it contains all the historical revisions, and all metadata.
Maybe backup to github? https://stackoverflow.com/questions/12019834/pushing-from-bazaar-to-github
Just syncing with, say, Spideroak is another option. They provide 2GB for free.
It’s bzr only for historical reasons. It wouldn’t be impossible to convert it completely to git and then to push to GitHub. For security, the nLab should have its own GitHub account (since it would need an SSH key with an empty passphrase to do the push). Alternatively, there’s Launchpad, which is the Bazaar version of GitHub.
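Roughly, such a conversion might look like this (this assumes the bzr-fastimport plugin is available; the local paths and the repository name under the GitHub account are just placeholders):

    # build a fresh git repository from the local bzr branch
    git init nlab-git && cd nlab-git
    bzr fast-export --plain ../nlab-content | git fast-import
    git reset --hard    # populate the working tree (not strictly needed just to push)
    # push everything to a (placeholder) repository under the nLab's GitHub account
    git remote add origin git@github.com:ncatlab/nlab-content.git
    git push -u origin master --tags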
The repository contains as much of the information as I could extract from the database without compromising personal data (i.e., web passwords). It is possible to reconstruct the nLab from the repository (and I have a script that does exactly that). Hmm, if the nLab had a GitHub account then I could dump all these scripts that I’ve written over the years there as well, which would be as good a way of handing them on as any.
Hi Bas,
thanks for the suggestions. Wanna go ahead and lend a hand?
I’d be happy to help a bit. Should we wait for the MediaWiki experiment? I’ve heard good things about MediaWiki’s git integration. E.g., this seems simple enough:
http://tech.tiefpunkt.com/2013/01/pushing-mediawiki-powered-wikis-to-github/
Is the nLab server running Linux?
Many things one could try eventually. But for the moment, do you think you could, using what has been discussed above in this thread, write a script which regularly downloads the nLab bzr repository and stores a copy in one of the places you have been suggesting?
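Concretely, something along these lines is all I have in mind: a tiny script run from cron, with all paths and the backup destination being placeholders.

    #!/bin/sh
    # Hypothetical backup script: refresh the local bzr mirror of the nLab
    # and keep a dated snapshot.  All paths are placeholders.
    set -e
    cd "$HOME/nlab-content"
    bzr pull
    tar czf "$HOME/nlab-backups/nlab-$(date +%F).tar.gz" -C "$HOME" nlab-content

    # crontab entry to run it nightly at 04:00:
    # 0 4 * * * "$HOME"/bin/nlab-backup.sh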
Andrew (#15), that sounds like a good idea. And this script that reconstructs an Instiki installation from a bzr repository seems useful to have. I’ve created an “organization” account on GitHub: github.com/ncatlab. If you tell me your GitHub username I can invite you to the “team” and you can add the various useful scripts you have. (Or if you prefer, you can just e-mail them to me and I’ll add them.)
Bas (#14, 17), I am maintaining an up-to-date version of the bzr repository on two different computers. But I agree that pushing to GitHub would be a good idea as well. If I get a moment and you haven’t done it yet, then I’ll write a script to do that. Yes, the nLab server is running Linux.