WordPress comments post by Googlebot / “slike za facebook”?

July 18th, 2012

While you may have spotted Googlebot POST requests in your logs and dismissed them, you may also have spotted some odd requests and search phrases turning up in your traffic analysis. For me it was “slike za facebook”. Not only was it turning up in my search phrase stats, but I could see my blog returning a 200 response for the URLs in the Google search results. The content in the responses came from a directory in my document root called “/photographyyikb/”, which sure enough existed and had the following structure:

ls -a ../photographyyikb/
.  ..  .htaccess  index.php  .tmpbz

The .htaccess rewrites the incoming URL as a parameter to index.php, which contains some obfuscated code that (I’m guessing; I didn’t look that hard) extracts ‘articles’ from what appears to be a pseudo filesystem built in the .tmpbz directory tree. You should be able to get a good idea of what kind of content was in that filesystem by searching for the directory name on Google. I’m guessing it’s plain-old link farming … only using farmland that’s not your own, obviously.

If you want an archive of the files from the exploit, including the .htaccess, the index.php and the .tmpbz directory, there’s a 10MB tarball here. When I take unique-ish looking sequences of words from the sneaky content, I can see other sites appear to have been compromised in a similar way, although the URL varies slightly. The URL seems to be made up of the stem “photography”, “pics” or “photos” with a 4-character string appended (“yikb” in my case).
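If you want to check your own document root for a directory named along the same lines, here is a minimal Python sketch. It assumes the pattern is exactly what I observed (one of the three stems plus four characters) and further assumes those four characters are lowercase letters, which I can't confirm beyond my own “yikb”. The function names are made up for illustration, and an innocent directory could match too, so treat a hit as a prompt to look, not as proof.

```python
import re
from pathlib import Path

# Assumed pattern, based only on the names observed above: one of three stems
# followed by exactly four lowercase letters (e.g. "photographyyikb").
SUSPECT_DIR = re.compile(r"(photography|pics|photos)[a-z]{4}")


def is_suspect_name(name):
    """Return True if a directory name matches the observed naming scheme."""
    return SUSPECT_DIR.fullmatch(name) is not None


def find_suspect_dirs(document_root):
    """Return the sorted names of top-level directories with suspect names."""
    root = Path(document_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and is_suspect_name(entry.name)
    )
```

Calling find_suspect_dirs() with the path to your document root returns a list of matching directory names, empty if nothing matches.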

The exploit itself appears to be known. If I understand correctly it’s this one:

http://bot24.blogspot.co.uk/2012/07/xss-redirector-and-csrf-vulnerabilities.html

My blog did indeed have the Akismet setting “Auto-delete spam submitted on posts more than a month old” checked. If the POSTing Googlebot is bothering you very much, you could also try something from this page:

http://rankexploits.com/musings/2012/comment-control-for-worpdress-htacess-rules/

The POSTs to wp-comments-post seem to originate (for the ones that I’ve checked) from Brazil, Turkey, Saudi Arabia and China. Some of the networks from which the traffic appears to originate have abuse/spam email addresses, some don’t. I sent out a few emails with highlights from my web log, but I suspect all I’ve accomplished is to get myself on some premium spam lists.
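For anyone wanting to pull the same highlights out of their own log, this is a rough Python sketch. It assumes the Apache “combined” log format and that the comment endpoint is /wp-comments-post.php; the regular expression and function name are mine and only illustrative. It counts POSTs to that path, per client address, where the User-Agent claims to be Googlebot.

```python
import re
from collections import Counter

# Assumes Apache "combined" format:
# ip ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_comment_posts(lines):
    """Count comment POSTs per client IP where the agent claims Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if (
            match["method"] == "POST"
            and match["path"].startswith("/wp-comments-post.php")
            and "Googlebot" in match["agent"]
        ):
            hits[match["ip"]] += 1
    return hits
```

Pass it an open log file (or any iterable of lines) and it returns a Counter keyed by address. A User-Agent string is trivially faked, so this only tells you who claims to be Googlebot, not who is.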

I’m not sure I haven’t put 2 and 2 together and come up with 7 here. I don’t have an explanation for the new content directory that links it firmly to the (not) Googlebot POSTs. The two certainly coincided in time. I just put this online because there didn’t seem to be very many other people experiencing the same thing. I hope it helps.

Update: this seems to be a well-established exploit. This page (deliberately not linked, you’ll have to copy-paste) lists many similarly affected WordPress blogs:

http://www.tcvv.org/cgi-bin/autolink.cgi/www.billygamble.com/www.infobarrel.com/www.infobarrel.com/www.thehackingforum.com

Here’s a saved copy of the page, just in case: www.thehackingforum.com.html in zip (~1MB)

Update 2: Google have quarantined blog.lolyco.com and lolyco.com. Not sure what the detection method is, but it was preceded by a Webmaster Tools message about a massive rise in 404s (I don’t often use Webmaster Tools; I only looked because I saw the message in search results). The owners of /photographyyikb/ must be making links to lolyco.com available to Google’s crawler on other sites before they’re available on lolyco.com. Or perhaps they’re merely failing to synchronise their URLs on all their hijacked hosts. Not sure what to do with the list of 521 404 responses Google has given me: what use is that without the referring URLs?

BBC website down

July 12th, 2012

BBC’s error 500 image

My parents always told me never to attempt to profit from the misfortune of others. Mum, Dad, sorry. I’m a BBC news addict in much the way a former colleague once told me “I love red wine, but it doesn’t love me back”. I visit their site many, many times per day. Spelling, grammar, semantics, objectivity and morality aside, it always JustWorks™. Except for this evening. I attach a couple of screenshots for posterity. I have to say – even though I did at first snigger at the BBC’s Fail Whale – that I expect something a little more … I don’t know … artistic than MyFirstPhotoShop.gif. Maybe it’s the flames.

If you’re looking at the image and thinking “scary clown”, don’t. I have very fond childhood memories of Bubbles the Clown. After unexpectedly long excursions into those parts of YouTube where the best-rated comments are “how did I get here?” and I can hear birds singing in the dawn outside, I find myself wishing that Zen (my ISP) would present me with a test card at the right time.

Oh look, BBC’s back online again and I haven’t finished typing. Those screenshots, before I get some clicking-mindlessly-on-links therapy:

BBC’s HTTP 500 page

Not sure what I clicked on to get this 410 response

Update: Sometime later I spotted this BBC article about the outage.

Update Friday 13th July: BBC is down again, just before 9am. The 410 Gone page is from the “Text only” link at the top of the page.

Cannot submit HTML form in mobile browser (Nokia/S40 “Response unknown”)

June 19th, 2012

A quick note, because this stole several hours of my life. I knocked up a little web application with a trivial web form which was CSS, cookie and JavaScript-free in the expectation that it would work even on elderly mobile phones, so long as they had a browser. It was properly “Web1.0” and worked a treat in my usual selection of browsers. Validator.w3.org said it was valid HTML. The page is a form with one or two visible inputs, plus hidden inputs set by the server-side code to maintain some state across several requests.

I copied the application up to a public host, ran it there and opened it up with my trusty Nokia 6500 Classic. It showed the first page in the series of 4, but then the submit input button, which should have submitted the form by POST, didn’t work. The browser popped up an alert saying “Response unknown” (on the Nokia; the other phones I tried just ignored me). There were plenty of pages to read which seemed similar, but none which helped me find the answer. I ended up copy-pasting the dynamically generated page source into a file on my host, then repeatedly editing it and reloading it on my phone until I found the problem.

It was my fault: I’m too used to modern browsers ‘helping’ me by guessing what I intended to happen when I wrote my bad page source. I had acquired a religious belief that the form element didn’t need an action attribute and that omitting it would cause the form to be submitted to the URL from which it was requested. The old phones I tried (the Nokia and a throwaway Samsung) would not submit a form without an action attribute. Their browsers were presumably designed in line with the HTML4 spec, which says that action is a REQUIRED attribute. HTML5 doesn’t say whether the action attribute is required or not, saying instead:

The action and formaction content attributes, if specified, must have a value

(and then a couple of lines later, they emphasise the wrong word, “it” instead of “if”, just to test if you’re reading carefully)

The fix was straightforward: I just copied the request-URI into an action attribute for the form element and the pages now work a treat on the two old phones I have here. I hope that’s useful to you if you’re developing apps for yesterday’s User-Agents!
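If you’d rather catch this before reaching for the phone, here is a small Python sketch that counts form elements with a missing or empty action attribute in a page’s source. It uses only the standard library’s html.parser; the class and function names are my own invention, and treating an empty action the same as a missing one is my assumption about what old browsers will tolerate, not something I tested.

```python
from html.parser import HTMLParser


class FormActionChecker(HTMLParser):
    """Counts <form> start tags and those lacking a non-empty action."""

    def __init__(self):
        super().__init__()
        self.forms_seen = 0
        self.forms_missing_action = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "form":
            return
        self.forms_seen += 1
        action = dict(attrs).get("action")
        if not action:  # attribute absent, valueless, or empty string
            self.forms_missing_action += 1


def count_forms_missing_action(html):
    """Return how many forms in the markup have no usable action attribute."""
    checker = FormActionChecker()
    checker.feed(html)
    checker.close()
    return checker.forms_missing_action
```

Feed it the generated page source as a string; anything above zero means there is a form that the old phones I tried would probably refuse to submit.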

Tavistock playgrounds

June 17th, 2012

Was struggling to find playgrounds around Tavistock, so started a little project here:

Tavistock playgrounds

With some pins in a Google Map:


View Tavistock playgrounds in a larger map

We’re often the only ones in the playgrounds we visit, which is a bit odd. It hasn’t been the best of weather recently, so I hope we’ll have company when the sun comes out!

Alan Turing (UK’s Gay Dad of Computing) 100th birthday

May 9th, 2012

Alan Turing

I wouldn’t have the career I love without Alan Turing; you wouldn’t have the Internet, and perhaps we wouldn’t now enjoy a comparatively free Europe. The UK chemically castrated Turing because he was gay. We drove him to commit suicide*. Gordon Brown said “sorry!” There was a (possibly poorly advised) petition to have him posthumously pardoned, and of tens of millions of Internet-and-computing-gadget enjoying Britons, only 23 thousand of us signed.

Alan Turing (had we not driven him to death) might have lived to receive a telegram from the Queen this year. It’s the Queen’s 60th Anniversary and nearly 60 years since Alan Turing died “on her watch”. Alan Turing’s treatment, the same treatment dished out to many of his contemporaries whose lives were no different to those of many decent people today, was barbaric. We owe him so much more than “sorry”.

I’d like to read next month that he was sent his anniversary message. He lives on through his contributions to us all and deserves a very happy 100th birthday.

* Jack Copeland says – quite reasonably – that the coroner’s suicide verdict is not strongly supported by evidence and that Turing’s death may have been simply an accident. That wouldn’t absolve the UK of our barbaric treatment of him.