Archive for the ‘Spider.my’ Category

What not to GET. Limiting what robots will request.

Monday, September 8th, 2008

I tested the spider at spider.my a few times recently. It was previously restricted to just a few sites that their respective admins had kindly volunteered. One of the immediate problems I noticed with releasing the spider in the wild was the number of pages I was mangling between storage ...

Damerau-Levenshtein algorithm: Levenshtein with transpositions

Wednesday, August 27th, 2008

I'm still working away slowly at Spider.my, and spotted a funny loop in the search suggestions: [Figure: Search for Teusday - how about Thursday?] The helpful hint is "maybe 'thursday' would get more results?". I'm using a simple Levenshtein distance algorithm to provide hints when only a few results ...
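
A minimal sketch of why a hint like this can happen, assuming plain Levenshtein distance (insertions, deletions, substitutions) is compared against a variant that also counts a swap of two adjacent letters as one edit. This is not the code Spider.my runs; the function names `levenshtein` and `osa_distance` are made up for illustration, and the second function is the "optimal string alignment" form of Damerau-Levenshtein rather than the unrestricted algorithm.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein plus adjacent transpositions (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Two adjacent letters swapped count as a single edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


if __name__ == "__main__":
    for candidate in ("tuesday", "thursday"):
        print(candidate,
              levenshtein("teusday", candidate),
              osa_distance("teusday", candidate))
    # tuesday 2 1
    # thursday 2 2
```

With plain Levenshtein, "teusday" is two edits from both "tuesday" and "thursday", so either word can win the tie. Counting the swapped "eu" as one edit puts "tuesday" ahead.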

Writing a search engine widget for Firefox / IE7

Friday, August 15th, 2008

I had been wondering how difficult it would be to make a search engine widget for spider.my - there's so much important stuff yet to be done, but I got distracted by the gloss! The great news is - very easy! You can visit spider.my now and add its search ...

Search suggestions with MySQL Levenshtein distance

Wednesday, August 6th, 2008

Up until a couple of days ago, the Spider.my search page applied only minimal formatting to hits, and no extra page decoration. The 'blank screen of death' appeared for a search miss: no results, no page content. Just to reassure the user the empty page was 'normal', I'd put a lame ...

Like Cuil. But with no funding, no style and no data.

Saturday, August 2nd, 2008

[Figure: Mandelbrot] I've been working so late recently, I've been early. It's silly, working late. I've been sitting here working at a glacial pace, thinking "I must go to sleep", until I can hear birds singing, the window starts to get bright again, and ...