For a published writer, Tess really needs to work on her online presence; no Google hits at all!
That's the only one where the spammer seems to have actually put in a link. But I'm reverting everything just to be safe.
Besides Eric's page, the home page of John's personal web is also a popular target this week. Perhaps we can add the sites being advertised there to the spam filter? Unfortunately, the words themselves are legitimate ones that we wouldn't want to block!
Our spammer has been a bit prolific. The full list of pages edited by that IP is:
+-----------------+--------+-----------------+
| name | web_id | author |
+-----------------+--------+-----------------+
| category theory | 1 | MickeyMouse |
| physics | 1 | SunDawn |
| Eric Forgy | 1 | WikiAdmin |
| Eric Forgy | 1 | TessWither |
| localization | 1 | ArthurCrowe |
| localization | 1 | TimSmith |
| SVG Sandbox | 1 | Alma Myers |
| HomePage | 14 | AnonymousCoward |
| TessWither | 1 | TessWither |
+-----------------+--------+-----------------+
Does that fit with your list, Toby?
@ David
I've blanked it and added it to category:spam.
@ Andrew
I don't have a list. The implicit list that I referred to in #2 as ‘everything’ is just the list attributed to TessWither. But I'll check those.
Those have all already been reverted.
How about the IP that made the edits attributed to TessWither but to pages not on your list: bicategory of relations, pentagon decagon hexagon identity, and saturated class of maps? That would be 119.111.124.194.
When you wrote HomePage up there, you mean John's, right? The second IP did one of those too. (Both also reverted, no worries.)
Yes, the HomePage was John's. That second IP modified the following pages:
+-----------------------------------+-----------------+---------------------+
| name | author | updated_at |
+-----------------------------------+-----------------+---------------------+
| HowTo | AnonymousCoward | 2009-10-22 08:40:06 |
| mathematics | Levi Steins | 2009-10-22 08:53:09 |
| Eric Forgy | WikiAdmin | 2009-11-26 03:01:38 |
| Eric Forgy | TessWither | 2009-11-26 09:48:08 |
| HomePage | AnonymousCoward | 2009-11-26 03:07:34 |
| HomePage | AnonymousCoward | 2009-11-27 01:46:47 |
| pentagon decagon hexagon identity | TessWither | 2009-11-26 03:06:00 |
| bicategory of relations | TessWither | 2009-11-26 03:07:23 |
| saturated class of maps | TessWither | 2009-11-26 03:06:48 |
+-----------------------------------+-----------------+---------------------+
Again, the HomePage is John's.
Ah, now. I've seen these IPs before. They're the Filipino spammers that I thought Jacques had banned. I misunderstood his email when he told me that he'd blocked them: he meant that he'd blocked them on his own Instiki installation, whereas I thought he'd blocked them within Instiki itself. In hindsight, my mistake is obvious!
Okay, I'll follow Jacques' lead and block these as the IPs seem static.
Right, the block is on: we're now blocking those two specific IP addresses. I tested the method with my own IP and it gave the "You don't have permission to view that link" message.
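The block itself sits at the server level, but the logic is simple enough to sketch. A minimal illustration in Python; the function name and the blocklist structure are mine for illustration, not the actual nLab configuration, and only one of the two blocked addresses is quoted in this thread:

```python
# Sketch of IP-level blocking as it might sit in front of a wiki.
# Only one of the two spammer IPs appears in the thread; the second
# is not quoted here, so the set below is deliberately incomplete.

BLOCKED_IPS = {"119.111.124.194"}

def check_request(remote_ip):
    """Return a (status, body) pair for an incoming request."""
    if remote_ip in BLOCKED_IPS:
        return 403, "You don't have permission to view that link"
    return 200, "OK"
```

In practice one would do this in the web server's own access rules rather than in application code, so blocked requests never reach the wiki at all.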
OK, those have all already been reverted too.
HTTP 403? Seems appropriate.
Those ()&%(*^% just hit my Instiki installation for my course. What do they do, search the internet specifically for Instiki installations?
Can we get Jacques to add the relevant domain names to the spam blacklist?
The spam blacklist is handled by an external body. I think the rationale is that a robot spamming Instiki installations is quite likely to be spamming other things as well, and so gets itself onto this wider blacklist. That list is more comprehensive than one based only on Instiki installations would be, so we benefit by using it. However, it does mean that when we get specifically targeted we need another line of defence, since the method of attack may not qualify for addition to the global list. In particular, when (as these seem to be) the spam edits are made by a person, there is only a limited amount that can be done to prevent them.
Fortunately, these particular spammers are operating from a pair of fixed IP addresses, so it's very simple to block them at the IP level. If they start using dynamic hosts or proxies, then we'll find another solution (perhaps by adding their links to the module that checks edits for dubious words and the like).
If someone is absolutely determined to hack our installation, then there's not a lot we can do about it. The game is always a balancing act: we try to make it unattractive for them to attack our installation, firstly by making it difficult and secondly by cleaning up quickly when they do. I suspect that on the "making it difficult" side we don't have a lot of room to manoeuvre before the measures become noticeable to regular users. But there is more we can do on the clean-up side. I know that you (Toby) are quite diligent about going through "recently revised" looking at new edits, but we could add a layer of automatic notification to that: say, a script that searches yesterday's edits for links off the nLab and presents (here?) a list of those so that they can be checked first. Similar things to look for would be an edit by an established author but from a new IP, or an edit by a new (or newish) author. These checks could all be done asynchronously, so they wouldn't interrupt the main flow of the lab's work.
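The "links off the nLab" check described above could be a very short script. A hedged sketch in Python; the domain, the shape of the edit records, and all names here are assumptions for illustration, not the real nLab schema:

```python
import re

# Sketch of the proposed clean-up aid: scan recent edits for links
# pointing off the nLab, so a human can review them first. The home
# domain and the (page, author, text) record shape are assumptions.

HOME_DOMAIN = "ncatlab.org"
LINK_RE = re.compile(r'https?://([^/\s"\']+)')

def external_links(edit_text):
    """Return the external domains linked from one edit's text."""
    return [d for d in LINK_RE.findall(edit_text) if HOME_DOMAIN not in d]

def flag_edits(edits):
    """edits: iterable of (page, author, text) tuples.
    Return the edits containing off-site links, with those domains."""
    flagged = []
    for page, author, text in edits:
        ext = external_links(text)
        if ext:
            flagged.append((page, author, ext))
    return flagged
```

The same loop could also flag an established author posting from a new IP, given access to the edit log; that part depends on the database schema, so it's omitted here.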
The blacklist that I was referring to is a content blacklist which I know that Jacques controls, since he removed ‘cialis’ from it (or rather, added a word edge there) so that I could write words like ‘specialise’.
I think that this may be what you meant by ‘the module that checks edits for dubious words and stuff’; if they come back, then we'll want to add their links to that.
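The "word edge" fix is just a regex word boundary. A small Python illustration of why the bare blacklist entry was a problem (the variable names are mine; the actual blacklist syntax may differ):

```python
import re

# Why the bare blacklist entry was a problem: 'cialis' is a substring
# of perfectly good words like 'specialise'. Adding word edges (\b)
# makes the pattern match only the standalone word.

bare = re.compile(r'cialis')
edged = re.compile(r'\bcialis\b')

assert bare.search("I want to specialise in topology")       # false positive
assert not edged.search("I want to specialise in topology")  # fixed
assert edged.search("buy cialis cheap")                      # spam still caught
```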
The ideas in your third paragraph all sound good, but I don't know how to program them, so you'll have to tell us how feasible they really are.
Reasonably feasible, once I implement an idea I've had which is to export the database (or rather, the page revisions) to a version control system. I think I've mentioned this elsewhere as a way of doing backups on a web-by-web basis. It would also make it easy to search through revisions looking for warning signs asynchronously since it would be operating on a copy of the database, not the database itself.
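The export step might look like the following Python sketch. The file layout, function name, and revision format are all assumptions (the real Instiki schema will differ); each revision is written to a per-page file, after which an ordinary git commit per revision would give the asynchronous, searchable copy described above:

```python
import os

def export_revisions(revisions, outdir):
    """Write each page's latest-seen revision text to outdir/<page>.md.

    revisions: iterable of (page_name, revision_text) in chronological
    order; later revisions of a page overwrite earlier ones, mirroring
    how successive commits would record them in version control.
    Returns the list of paths written, one per input revision.
    """
    os.makedirs(outdir, exist_ok=True)
    written = []
    for page, text in revisions:
        safe = page.replace("/", "_").replace(" ", "_")
        path = os.path.join(outdir, safe + ".md")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        written.append(path)
        # In the real tool one would now run, for example:
        #   git add <path> && git commit -m "<page> @ <timestamp>"
        # so that each wiki revision becomes one commit.
    return written
```

Searching for warning signs then becomes searching the working copy or the commit history, on a machine separate from the live database.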
I think I know all the pieces of how to do this, but I just need a short time when I can concentrate on putting them together. Maybe 2011? I may be due for a sabbatical that year ...
I noticed that Jacques had reverted some spam on the Instiki wiki and checked the IP against our logs. That brought up quasicompact. I've edited away the spam (look at revision 10 if you're interested), essentially putting back what was in revision 9. The spam was another of those links to essay writers, but from a new IP.
Two more spam edits. One on physics and the other on Towards Higher Categories (johnbaez). Both inserting links to dissertation/paper writing services with some semi-sensible text around them.
There is a strange entry at AnodyneHoward. The entry is anodyne in the non-mathematical sense, but it is not very useful! Should it be blanked or left as it is?
It’s just someone who has made a few edits over the years up to last month. Seems fine to me.
I have given the page AnodyneHoward the line
category: people
to make it clearer that here a person is introducing themselves.
Ideally, one could clarify the text by changing
Nobody, with no scholarship….
to something more explicit like
Signing edits with “AnodyneHoward”, I am nobody, with no scholarship….
But I won’t do this now. Maybe AnodyneHoward will see this and have an idea of what to do.