I just wrote an addition to regular space, and I can't even tell if it got saved properly. There's no response from the Lab (just a long wait, no error message, until it times out), even though I've restarted it twice.
Incidentally, it's been at least a week since I've been able to get more than two characters into the terminal at a time. Even to restart it with ~/x, I have to log in twice!
From wget, I get ‘ERROR 504: Gateway Timeout.’.
/etc/init.d/instiki status says ‘instiki dead but subsys locked’; anything besides status gives an error that /usr/local/instiki/tmp/pids/server.pid doesn't exist or that instiki is already running. But ps -Al gives nothing that looks relevant (except for lighttpd).
Also, apparently the reason why I could never get more than two characters in is that typing ~ will always freeze it! Otherwise, the ssh terminal is doing just fine. So cd; ./x is probably best from now on.
When instiki is launched from /etc/init.d/instiki, a lockfile is created. The lockfile isn't created by instiki itself, but by the script in /etc/init.d. Thus if instiki crashes, the lockfile isn't removed and the script won't restart it. So it's necessary to remove the lockfile before relaunching:
rm /var/lock/subsys/instiki
But one should check that instiki isn't running before removing the lockfile (via ps auxc).
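For what it's worth, the check-then-remove step could be sketched roughly as below. This is only an illustration: the lockfile path is the one quoted above, but the process name `instiki` is an assumption (the real process may show up under another name, such as the Ruby interpreter), and `clear_stale_lock` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: remove the init-script lockfile only if no instiki process is running.
# LOCKFILE defaults to the path quoted in this thread.
# PROC_NAME is an assumption; adjust it to whatever ps actually shows.
LOCKFILE="${LOCKFILE:-/var/lock/subsys/instiki}"
PROC_NAME="${PROC_NAME:-instiki}"

clear_stale_lock() {
    # "ps auxc" lists bare command names; grep -w matches the name as a word.
    if ps auxc | grep -qw "$PROC_NAME"; then
        echo "$PROC_NAME is still running; leaving $LOCKFILE alone"
        return 1
    fi
    if [ -e "$LOCKFILE" ]; then
        rm -f "$LOCKFILE" && echo "removed stale $LOCKFILE"
    else
        echo "no lockfile at $LOCKFILE"
    fi
}
```

One would call `clear_stale_lock` (as a user allowed to delete the lockfile) before relaunching instiki through the init script.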
Oh, and some implementations of ssh use ~ as an escape character. To quote:
-e escape_char
Sets the escape character for sessions with a pty (default: ‘~’). The escape character is only recognized at the beginning of a line. The escape character followed by a dot (‘.’) closes the connection; followed by control-Z suspends the connection; and followed by itself sends the escape character once. Setting the character to “none” disables any escapes and makes the session fully transparent.
So you could avoid the problem by doing ssh -e '#' or something like that.
Or just type ~ twice. Jeez, why didn't I ever try that? Even accidentally!
Thanks for /var/lock/subsys/instiki; I knew that there was a lock file somewhere, but I never knew where it was.
We need some method to archive important results of discussions here. In five months the nLab may halt while I am working on it, and I will have to try to remember where it was that I saw you two chat about lockfiles.
A good idea might be to split off from the nLab's HowTo page an AdminHowTo page that lists answers to issues like these.
It's a moot point, but I would put this sort of stuff on the 'nlabmeta' web. Most people don't need to know about this stuff, and most people couldn't do anything about it even if they did know it.
But you're right. We do need to archive some of the stuff here. Some of it does get done (like how to do redirects) but it's not as simple as just copying the solution to a problem across from here to there. For example, take the 'downloading the n-lab' thread. There's a useful script there that anyone who wants to download the n-lab should know about. However, if someone hasn't yet had the inclination to download the entire n-lab, then I'd rather not put the idea into their heads: if everyone does it, there goes our bandwidth!
We probably have three categories of information:
Stuff everyone might find useful. Such as how to do redirects, or include SVGs.
Stuff that anyone could find useful, but isn't necessary for ordinary use of the n-lab. Such as downloading the whole lot.
Stuff that only the lab elves need to know. Such as how to reboot the server.
I'd recommend: n-lab HowTo for the first, n-labmeta HowTo for the second, and a lab elves technical page for the third.
I'm not saying that any information should be hidden or deliberately obscured, but that it is layered in such a fashion that the most likely and useful information is encountered first.
The Lab seems to be out of action. Perhaps it has a hang-over after celebrating too much last night!!! (For the visible record, today is Christmas day)
It works now for me.
I couldn’t get to it just now, so I restarted it (again?).
nLab seems down to me for a few hours (after a long time it returns a blank page for any request)
I restarted it again.
@Rod Try again. It gave me a blank to start with but is now working normally it seems.
I have just restarted it once more.
I noticed that over the last two days the nLab had a multitude of down-times of a few minutes each, from which it did recover. I believe these recoveries are due to Andrew’s recent modification, which makes the instiki software restart at regular intervals of a few minutes, or something like this.
So I am wondering about two things:
even if it does recover, why is it down so many times per day?
why does the automatic recovery still fail sometimes?
Lab is down again.
It still seems to be down. Happy New Year everyone. 2011 has got here ’safely’.
(4 minutes later) Its automatic reboot worked :-) It is back.
Actually, I just restarted it manually.
@Mike. Can’t be right always! so doubly Happy New Year to you.
Down again? I cannot access the nLab now (16:25 CET).
I have restarted it.
It seems to be down again.
Later: What is the situation about automatic restart since it has been out for at least 15 minutes now?
I have restarted the server.
Down again, and I’m not at a place where I can get in to the server. (I don’t have my key with me.)
Yeah, I was just about to say. Someone should host Jim’s recently uploaded article in a stable place for now and change the link at the cafe, as it reflects badly on the cafe not being able to satisfy links.
I have restarted the server.
it reflects badly on the cafe not being able to satisfy links.
I find that the smallest problem. What worries me is the Lab being down. And what really worries me is: the Journal-to-be not being up.
I wish I knew what we could do.
I have restarted it again.
Long waits before it finally returns an empty page. Persisting for about 1 hour now.
I have restarted it now.
We’ve been getting some memory spikes recently, with the instiki processes going up to about 3 times their usual level. When I’ve spotted them, I’ve done a “soft reset” which has worked fine. When I get a bit of time, I’ll track down what’s causing them and report back to Jacques.
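In case it helps with spotting these, here is a rough sketch of how one might total the memory of the relevant processes and flag a spike. Everything named here is an assumption for illustration: the process name `ruby`, the baseline figure, and the helper names `sum_rss_kb`, `is_spike` and `check_memory`. The factor of 3 just mirrors the "about 3 times" observation above.

```shell
#!/bin/sh
# Sketch: sum the resident memory of processes with a given command name and
# flag it when it exceeds a multiple of a baseline.
# PROC_NAME, BASELINE_KB and FACTOR are illustrative, not measured nLab values.
PROC_NAME="${PROC_NAME:-ruby}"
BASELINE_KB="${BASELINE_KB:-200000}"
FACTOR="${FACTOR:-3}"

# Reads "rss command" lines on stdin and prints the summed RSS (in KB) for $1.
sum_rss_kb() {
    awk -v name="$1" '$2 == name { total += $1 } END { print total + 0 }'
}

# Succeeds when $1 (KB in use) is more than FACTOR times BASELINE_KB.
is_spike() {
    [ "$1" -gt $((BASELINE_KB * FACTOR)) ]
}

check_memory() {
    # "rss=" and "comm=" suppress the header line of ps.
    used=$(ps -eo rss=,comm= | sum_rss_kb "$PROC_NAME")
    if is_spike "$used"; then
        echo "memory spike: ${used} KB in $PROC_NAME processes"
    else
        echo "memory ok: ${used} KB in $PROC_NAME processes"
    fi
}
```

Running `check_memory` periodically and keeping its output would at least show when the spikes happen, which might help narrow down the cause.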
It’s down again…
I was about to restart it, but it is already back.
I can’t load nLab, including my personal pages. Edit: after few minutes succeeded. Edit later: veeery long loads continue. Edit even later: much better now, almost normal!
nLab does not load now (Feb 1, 15:24)
It has been off for some time. (Edit: it is back 14.39 UK)
I have now restarted it.
Down, 22:02:24 UTC 2011
restarted
It seems to be down again.
I have restarted it.
(Would have done it an hour ago, had I not had wild problems with my internet connection.)
Thanks
It might be a good idea if you (Tim) for instance also got access to the Lab server. Since we are back to the point where it needs restarting once or twice a day, it would be good to have more people be able to do so. If you are willing to help with this, I suppose Andrew would be glad to create an account for you. (You need a public key, though. In case you are not familiar with the trouble one has to go through for this, I can send you a step-by-step list of what to do.)
The steering committee should decide on this, but certainly I have no problems with Tim having reboot access to the nLab.
What I’d really like is for someone who knows a thing or two about what might be the problem to tell me what to look for!
Guess what …..
PS still down 6 hours later.
Phew! back at last (8 hours).
I was offline all weekend. Was sick.
Hope you get well soon.
Is the n-lab set for automatic reboot still?
Is the n-lab set for automatic reboot still?
I think it is, but for some reason the automatic reboot doesn’t always work. I suppose when it hangs it hangs so badly that it cannot do anything anymore, hence also not reboot itself.
Actually, the automatic reboot stuff doesn’t seem to work as I’d hoped it would. It seems that it goes into the restart cycle, but the restart isn’t in place by the time it next checks, so it assumes that the restart didn’t work and goes into a sulk. I need to investigate other systems.
Can’t we just set a cron job that calls the command which we call by hand every now and then?
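A cron-driven watchdog along those lines might look roughly like the sketch below. It is not the mechanism currently installed: the URL, the restart command (`/etc/init.d/instiki restart` is assumed to exist), the stamp file and the grace period are all placeholders. The grace period is there so that a restart that is still coming up is not mistaken for a failed one on the next check.

```shell
#!/bin/sh
# Watchdog sketch: probe the site, restart if it does not answer, and skip
# further restarts for GRACE seconds so a slow restart is not retried at once.
# URL, RESTART_CMD, STAMP, TIMEOUT and GRACE are placeholders, not real settings.
URL="${URL:-http://localhost/}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/instiki restart}"
STAMP="${STAMP:-/tmp/nlab-watchdog.stamp}"
TIMEOUT="${TIMEOUT:-30}"
GRACE="${GRACE:-120}"

site_is_up() {
    # -f makes curl fail on HTTP errors such as 504; --max-time bounds a hang.
    curl -fsS --max-time "$TIMEOUT" -o /dev/null "$URL" 2>/dev/null
}

recently_restarted() {
    # The stamp file holds the epoch time of the last restart attempt.
    [ -f "$STAMP" ] || return 1
    last=$(cat "$STAMP")
    now=$(date +%s)
    [ $((now - last)) -lt "$GRACE" ]
}

watchdog() {
    if site_is_up; then
        echo "up"
    elif recently_restarted; then
        echo "down, but restarted recently; waiting"
    else
        echo "down, restarting"
        date +%s > "$STAMP"
        $RESTART_CMD
    fi
}

# Hypothetical crontab entry, assuming this file ends with a call to watchdog:
# */5 * * * * /path/to/watchdog.sh
```

Whether this would behave better than the existing automatic restart is untested; it only shows the shape of the cron approach.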
The lab would seem to have gone down again.
I seem to have restarted it.
It is working now, thanks.
Down again.
I have restarted it.
Tim, should we try to give you access to the nLab server? Would you be willing to? Given the number of your lab-down-reports, it would be really good for the Lab community and for you, it seems, if you could restart the lab.
Lab seems not responsive at the moment. Edit: 10 minutes later: Lab is back.
I had restarted it, but before I saw your message here.
As long as I am working on the Lab myself, I notice quickly when it goes down. Trouble begins when I am not working on it myself.
Edit: I could not get the response from Lab for last several minutes, but just now it is back again, but somewhat slow. New entry Maschke’s theorem.
Topically as far as another thread goes, the Lab seems to be down. Although this does not apply this time, I seemed to notice that it crashes at weekends. Is this true? Is it internal to the software (e.g. a routine scheduled garbage collection or something like that) which overloads it, or is there some external source, such as someone downloading new material to have a personal copy?
Topically as far as another thread goes, the Lab seems to be down.
I have now restarted it.
Although this does not apply this time, I seemed to notice that it crashes at weekends. Is this true?
It crashes several times each day. But during the week I am usually online and notice it fairly quickly, and restart it without always dropping a note here. On weekends even I am offline for longer periods. I think that explains it.
But I will now try to record every single time that I restart the server, so that we get a better idea.
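If the restart is done from the command line anyway, the record could be kept automatically. A minimal sketch, with a made-up log location and the restart command again only assumed:

```shell
#!/bin/sh
# Sketch: log each manual restart with a UTC timestamp, then tally by weekday.
# RESTART_LOG and RESTART_CMD are placeholders for illustration.
RESTART_LOG="${RESTART_LOG:-$HOME/nlab-restarts.log}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/instiki restart}"

logged_restart() {
    # Line format: weekday abbreviation, date, time, "UTC".
    # LC_ALL=C keeps the weekday names in English regardless of locale.
    LC_ALL=C date -u '+%a %Y-%m-%d %H:%M:%S UTC' >> "$RESTART_LOG"
    $RESTART_CMD
}

restarts_by_weekday() {
    # The first field of each log line is the weekday abbreviation.
    cut -d' ' -f1 "$RESTART_LOG" | sort | uniq -c
}
```

After a few weeks, `restarts_by_weekday` would show whether the weekend impression holds up for manual restarts.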
have restarted the Lab
It seems to have recovered without me doing anything.
have restarted the server
I have restarted the Lab.
But then something curious happened: the pages that I was waiting for to display did display the instant that my finger touched the enter-key to send off the restart command.
Maybe a coincidence. But it did happen before to me. So I thought I’d mention it.
Generally, the lab is very slow this afternoon. That makes it hard to tell whether it’s down or just being lazy.
I have restarted it again
Happy Easter. But lab down. Apr 24, 11:22 CET.
It has been down since 6.30 BST at least. And Happy Easter to everyone.
Well, I do not like common holidays for the plain reason that most things do not function, are closed or are forbidden during them. This time it is the Lab.
Well, now there are problems with the Forum as well. It takes several minutes to refresh the preview of a post.
Some of that may be due to lots of people downloading megalength films somewhere between you and the main ’superhighway’. (After one preview: That was not too bad.)
No, there is some other problem. Somehow I cannot get the double dollar sign to work consistently. When it does not like it, like usual
it runs for minutes and at the end returns the page without the formula.
(Look at the source of this)
I have the same problem.
When the lab is down, then the rendering of maths here won’t work either. That’s because the actual conversion takes place on the same server as the nlab due to the fact that it needs something beyond what I’m allowed to run here.
I’ve just restarted the lab. My apologies for the long delay in restarting.
thanks, Andrew. It is good to know the reason.
Maybe Tim’s observation above was right after all: as far as I can see the lab was down consistently at the very end of the week, somewhere from Sunday to Monday during the last weeks, maybe longer.
I just realised that this post (which predates the Forum categories) is in the Atrium; I just moved it to Technical Matters.
I have restarted the server.
What were the symptoms that time? I’m intrigued because I happened to be logged in as root when you restarted it and so when I noticed that you restarted it, I checked the logs and couldn’t see any of the usual suspects.
I keep restarting it, but it does not come back at the moment.
What were the symptoms that time?
Same as always: after calling either show or save nothing happens for a minute or so, and then an error message appears.
Now it’s back.
It’s down again. I can’t even restart it, because even the command line is not reacting. (Had this before throughout the morning. It has been immensely slow ever since two hours ago or so.)
Now I have restarted it again. I have received one response of one page, but am already waiting again for the second page.
By the way, it is this kind of very frustrating experience that made me ask last time: what is our perspective? Do we need to stick with this?
If the commandline is also slow then that’s nothing to do with Instiki but is to do with the connection between your computer and the server.
In poking around just now I discovered that the daily updates weren’t getting applied due to an issue with a dependency. I’ve just fixed that.
If the commandline is also slow then that’s nothing to do with Instiki but is to do with the connection between your computer and the server.
Sure, but since at the same time my computer happily accesses all kinds of sites, it seems to indicate that the server that the Lab is running on is busy with something else. Might be a hint as to what the problem is, I thought.
By the way, I have just restarted once more. This time calling any page produced a Passenger error message saying that Ruby on Rails could not be started.
Sure, but since at the same time my computer happily accesses all kinds of sites, it seems to indicate that the server that the nLab is running on is busy with something else.
Not necessarily. Remember that we’re using a VPS on some system, so there are several links in the chain between your computer and the nLab server, any one of which might be causing the slowdown.
it went really well the last days, but now I had to restart the Lab again.
Maybe the lab is secretly a republican (in the UK sense of the word, not the American).
It looks as if the Lab has been to a late night street party and has a hangover! It has not yet got up. Can someone turn on the alarm clock and wake it up? (In case non-Brits are unaware of the events in the UK yesterday, there was a royal wedding! As a further completely irrelevant fact, the newlyweds live not far from me on Anglesey so there were ‘street parties’… in the streets! At least the weather was good.) End of irrelevant comments… can someone restart the Lab please?
restarted
Something struck me this morning. One reason why it might go down so often on Sundays might be because that is the day when the system does a full backup. When it does that, it locks the database so that no more information can be written to it. Due to the size of the database, the time that it is locked might be significant. And if instiki tries to access it while it is locked, then that might cause Something Bad.
I’ll investigate further.
I’ve had to restart the Lab
There was quite a time gap between 93 and 94… is it now a little more reliable?
Not sure, but it seems to be down again at the moment.
I have restarted it.
It went right down again. I have restarted once more.
I had to restart the server again (5 min ago).
I just restarted the server. It required a hard restart.