Damerau Levenshtein

I hope a WordPress page will be a better place to keep the latest and greatest version of the software than individual blog posts are. Maybe one day I’ll get round to starting a SourceForge project or something!

Damerau-Levenshtein is an edit-distance / optimal alignment function. See the Wikipedia page on Damerau-Levenshtein for background. It is similar to the Levenshtein function but treats a transposition (two adjacent letters swapped) as one mistake instead of two:

levenshtein('hello', 'hlelo') = 2
damlev('hello', 'hlelo') = 1
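
To make the difference concrete, here is a minimal Java sketch of the textbook dynamic-programming approach, in the "optimal string alignment" form. It is not the code from the downloads below; the class name DamLevSketch and the helper distance are made up for illustration, and the method names levenshtein and damlev are borrowed from the example above. The only difference between the two functions is one extra check for a pair of adjacent swapped letters.

```java
// Illustrative sketch only: NOT the downloadable Java methods from this page.
// Class and helper names are hypothetical.
final class DamLevSketch {

    /** Plain Levenshtein: insertions, deletions and substitutions each cost 1. */
    static int levenshtein(String a, String b) {
        return distance(a, b, false);
    }

    /** Optimal string alignment: as above, plus adjacent transpositions at cost 1. */
    static int damlev(String a, String b) {
        return distance(a, b, true);
    }

    private static int distance(String a, String b, boolean allowTranspositions) {
        int n = a.length();
        int m = b.length();
        // d[i][j] = distance between the first i chars of a and the first j chars of b
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // delete all i characters
        for (int j = 0; j <= m; j++) d[0][j] = j; // insert all j characters

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // The extra step: two adjacent letters swapped count as one edit.
                if (allowTranspositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("hello", "hlelo")); // prints 2
        System.out.println(damlev("hello", "hlelo"));      // prints 1
    }
}
```

The sketch uses a full (n+1) by (m+1) table for readability. Keeping only the last three rows would cut the memory use, which matters if the function is called once per dictionary word.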

The latest version, 2009Nov30, is almost a complete rewrite of the Java code. You can test the functions online at spider.my.

As long as spider.my is up (another project, another endless list of bugs), you should be able to download the latest versions from there:

Damerau Levenshtein Java methods

Damerau Levenshtein MySQL UDF C++ source code (in a ZIP archive)

Some articles I wrote about the code:

Fix of a bug reported by Björn (thank you for reporting)

A fix and a note that spider.my is using the Java version now, instead of the MySQL UDF

The original Damerau Levenshtein article describing differences between the UDFs

Where I started – a need to provide spelling suggestions (includes compiling instructions for the MySQL UDFs)