Clues to the TBH ‘black blog’ author

July 25th, 2009
Hex dump of image file

I was reading Lim Kit Siang’s blog recently and saw the article about the Teoh Beng Hock ‘black blog’. LKS is calling it a ‘black blog’ because it makes some serious allegations about wrongdoing in Selangor. The blog’s author has posted some image files which they say are copies of documents that were once in the possession of Teoh Beng Hock.

The author doesn’t reveal their identity, but sometimes clues can be found in content. My first attempt was to use the fabulous ImageMagick command line utilities to list the file properties for the images. Image files can contain ‘metadata’ besides the data that actually comprises the image itself. Metadata is information about the file, and can sometimes include author identification and the software used. Many cameras write their make and model, and sometimes their settings, into image files.

ImageMagick didn’t turn up much of interest. Out of curiosity, I opened the file with a text editor and saw a ‘Photoshop 3.0’ label in the text which ImageMagick had not printed. That prompted me to write a hex dump page for poditronic.com, as I doubt many people will feel happy about opening strange files with the ‘wrong’ application.

I don’t have Photoshop, and the GIMP didn’t reveal the Photoshop metadata either. Fortunately, some information is available online about the Photoshop metadata format. If you use the hexdump page to examine the image of the documents posted at t4tbh.blogspot.com, you’ll see Picasa metadata first (is that added by blogspot, or was it the result of a deliberate use of Picasa by the author?), and then the Photoshop 3.0 label. Using the format information, you can read the values of the bytes for yourself and confirm what I was able to discover: that the Photoshop metadata in the image is … empty.
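For readers who would rather not decode the bytes by hand, here is a minimal Python sketch of the same idea. It is not the code behind the hexdump page. It assumes the image is a JPEG, that the ‘Photoshop 3.0’ label sits in an APP13 segment, and that the label is followed by ‘8BIM’ resource blocks as described in the published Photoshop format notes. The function names are made up for illustration.

```python
import struct

PHOTOSHOP_TAG = b"Photoshop 3.0\x00"

def iter_jpeg_segments(data):
    """Yield (marker, payload) for each header segment before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync; a fuller parser would try to recover
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length

def iter_photoshop_resources(payload):
    """Yield (resource_id, name, data) from an APP13 'Photoshop 3.0' payload."""
    if not payload.startswith(PHOTOSHOP_TAG):
        return
    pos = len(PHOTOSHOP_TAG)
    while pos + 12 <= len(payload) and payload[pos:pos + 4] == b"8BIM":
        (res_id,) = struct.unpack(">H", payload[pos + 4:pos + 6])
        name_len = payload[pos + 6]
        name = payload[pos + 7:pos + 7 + name_len]
        name_field = 1 + name_len
        name_field += name_field % 2  # Pascal string is padded to even size
        pos += 6 + name_field
        (size,) = struct.unpack(">I", payload[pos:pos + 4])
        pos += 4
        yield res_id, name, payload[pos:pos + size]
        pos += size + (size % 2)  # resource data is padded to even size too
```

To use it, read the file's bytes, loop over `iter_jpeg_segments`, and pass any payload with marker `0xED` to `iter_photoshop_resources`. Under these assumptions, an ‘empty’ Photoshop section shows up as blocks with zero-length data, or as no blocks at all.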

I’m sorry if the title is misleading. Besides ‘Photoshop 3.0’, I don’t have a clue. There are all sorts of other interesting images online though, some of them containing possibly more data than their authors realise. Happy fishing!

File contents hex dump at poditronic.com

July 25th, 2009
Hex dump of image file

Just added a hex dump page at poditronic.com. The code is very similar to the “View Server HTTP headers” page, but it could be useful to some. The page currently loads just the first 512 bytes of the content at a given URL, and presents the bytes in a table. Printable bytes are displayed as-is; non-printable bytes are displayed as hex. I wrote the page because I wanted to explain to someone what metadata is stored with an image. A web page is a big improvement in accessibility over asking someone to open a terminal and type in commands!
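The display rule is simple enough to sketch in a few lines of Python. This is only an illustration, not the page's actual code. The function names, the 16-byte row width, and the choice to show a space as hex (so it stays visible in a table cell) are all assumptions.

```python
def render_cell(byte):
    """Show visible ASCII characters as-is and everything else as two hex digits."""
    if 0x21 <= byte <= 0x7E:
        return chr(byte)
    return f"{byte:02X}"

def hex_dump_rows(data, limit=512, width=16):
    """Split the first `limit` bytes into table rows of `width` cells each."""
    chunk = data[:limit]
    return [
        [render_cell(b) for b in chunk[i:i + width]]
        for i in range(0, len(chunk), width)
    ]
```

For a JPEG, the first row would begin with the cells `FF` and `D8`, followed by whatever marker and label bytes the file contains.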

I had to remove the Referer field because one server we wanted to look at images on was returning 404 Not Found. I imagine this is a form of hotlinking protection. Sending a request to the server without the Referer field resulted in the image being returned as expected. An obvious future enhancement would be to provide a toggle for the Referer field, and perhaps also an option to change the number of bytes to display.
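A toggle like that could look roughly like the following Python sketch, which uses the standard library's `urllib.request`. It does not add a Referer header unless asked to. The function names and the 10-second timeout are assumptions for illustration, and this is not how the poditronic.com page is implemented.

```python
import urllib.request

def build_request(url, referer=None):
    """Build a GET request; a Referer header is sent only if one is given."""
    headers = {}
    if referer is not None:
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)

def fetch_prefix(url, referer=None, max_bytes=512):
    """Fetch at most `max_bytes` from the start of the response body."""
    request = build_request(url, referer)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read(max_bytes)
```

Calling `fetch_prefix(url)` mimics the behaviour that worked against the hotlink-protected server, while `fetch_prefix(url, referer="http://example.com/")` lets you see how a server reacts to a particular referring page.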

It’s interesting to see what metadata is left in images on the web – happy fishing!

Browser request headers page at poditronic.com

July 21st, 2009
View browser request header fields at poditronic.com

Another fairly easy task at poditronic.com: displaying the headers received by the Spinneret web server. This might be a useful tool if you want to know exactly what it is that your browser is sending to web servers when it requests a page. This first attempt does have one wrinkle: it shows the header fields as they are received from the Apache web server that’s doing vhosting and reverse proxying on my network. Check your browser request headers here.
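To give a feel for how little is involved, here is a rough Python equivalent built on the standard library's `http.server`. It is not Spinneret code. The handler name, the port, and the idea of flagging the `X-Forwarded-*` fields (which Apache's mod_proxy adds when reverse proxying) are assumptions made for this sketch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Header fields that Apache's mod_proxy adds when acting as a reverse proxy.
PROXY_ADDED = {"x-forwarded-for", "x-forwarded-host", "x-forwarded-server"}

def format_headers(pairs):
    """Render (name, value) pairs as plain text, flagging proxy-added fields."""
    lines = []
    for name, value in pairs:
        note = "  (likely added by the proxy)" if name.lower() in PROXY_ADDED else ""
        lines.append(f"{name}: {value}{note}")
    return "\n".join(lines) + "\n"

class EchoHeaders(BaseHTTPRequestHandler):
    """Reply to any GET with the request header fields as received."""

    def do_GET(self):
        body = format_headers(self.headers.items()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Run the echo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), EchoHeaders).serve_forever()
```

Run `serve()` and visit `http://127.0.0.1:8000/` to see what your browser sends. With a reverse proxy in front, the flagged lines are the ones the browser did not send itself.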

What is my IP Address at poditronic.com

July 21st, 2009
Poditronic WAN IP address check

I’ve just added a couple of IP address checking pages at poditronic.com as examples of how easy it is to do this with Spinneret. There’s a human-readable WAN IP address checking page, and one that produces a plain-text (no markup) IP address check that should be ideal for writing scripts against.
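The plain-text variant is the more script-friendly of the two, and the idea can be sketched as a small Python WSGI application. This is an illustration, not the Spinneret page. It assumes the application sits behind a single trusted reverse proxy that appends the client's address to `X-Forwarded-For`. Without such a proxy that header can be forged and only `REMOTE_ADDR` should be used.

```python
def client_ip(environ):
    """Return the caller's address, preferring the proxy-supplied value."""
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded:
        # The trusted proxy appends the address it saw, so take the last entry.
        return forwarded.split(",")[-1].strip()
    return environ.get("REMOTE_ADDR", "")

def plain_ip_app(environ, start_response):
    """WSGI application that answers with the caller's IP and nothing else."""
    body = (client_ip(environ) + "\n").encode("ascii")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Because the response is a bare address followed by a newline, a script can fetch the URL and use the output directly without parsing any markup.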

I’m still hitting bugs in the utility classes I’m using for producing the pages, so I’m going to delay making Spinneret public for a little longer. While you could use Spinneret without the utility classes, they make a big difference to how fast you can produce working pages.

poditronic.com online – served by Spinneret

July 20th, 2009
Poditronic - first screenshot!

I’m slowly inching towards a first release of Spinneret. I’ll be testing some of the features out on a new domain: poditronic.com. The source and API will probably go up on spider.my eventually, but as functionality allows, I’ll put some tutorial-like content up on poditronic.com.

Not a very interesting post, I know, but I thought it was important to mark the occasion. Wah 4:20am. Time for bed!