<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Damerau-Levenshtein algorithm: Levenshtein with transpositions</title>
	<atom:link href="http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/</link>
	<description>Working nights with the thief of time</description>
	<lastBuildDate>Sun, 01 Jan 2012 17:42:51 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: sam j levy &#187; MySQL Levenshtein and Damerau-Levenshtein UDF&#8217;s</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-4017</link>
		<dc:creator>sam j levy &#187; MySQL Levenshtein and Damerau-Levenshtein UDF&#8217;s</dc:creator>
		<pubDate>Thu, 10 Mar 2011 19:17:09 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-4017</guid>
		<description>[...] Damerau-Levenshtein UDF  damlev.zip The Damerau-Levenshtein metric is a slightly modified version of the Levenshtein metric, a description of the differences can be found on Wikipedia and on Sean Collins&#8217; site. [...]</description>
		<content:encoded><![CDATA[<p>[...] Damerau-Levenshtein UDF  damlev.zip The Damerau-Levenshtein metric is a slightly modified version of the Levenshtein metric, a description of the differences can be found on Wikipedia and on Sean Collins&#8217; site. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kompilacja UDF dla mysql</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-3305</link>
		<dc:creator>kompilacja UDF dla mysql</dc:creator>
		<pubDate>Mon, 14 Jun 2010 08:01:45 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-3305</guid>
		<description>[...] parameters and examples, but I still have a problem. To explain: I have a function written in C (Damerau-Levenshtein) that should be loaded into MySQL as a dynamic library. Here is the problem: when executing in [...]</description>
		<content:encoded><![CDATA[<p>[...] parameters and examples, but I still have a problem. To explain: I have a function written in C (Damerau-Levenshtein) that should be loaded into MySQL as a dynamic library. Here is the problem: when executing in [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexander</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-3110</link>
		<dc:creator>Alexander</dc:creator>
		<pubDate>Sat, 06 Feb 2010 21:46:29 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-3110</guid>
		<description>Sean, 

I believe that if we define Damerau-Levenshtein distance as &quot;minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two characters&quot; (as it is done in Wikipedia, and also in this http://portal.acm.org/citation.cfm?id=356827.356830 article, where the notion seems to be introduced for the first time), then an algorithm calculating this distance should return 2 for (&quot;TO&quot;, &quot;OST&quot;). Otherwise the algorithm is calculating something else :)

But that doesn&#039;t actually matter. Sorry for the long comment, and thank you for your answer!</description>
		<content:encoded><![CDATA[<p>Sean, </p>
<p>I believe that if we define Damerau-Levenshtein distance as &#8220;minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two characters&#8221; (as it is done in Wikipedia, and also in this <a href="http://portal.acm.org/citation.cfm?id=356827.356830" rel="nofollow">http://portal.acm.org/citation.cfm?id=356827.356830</a> article, where the notion seems to be introduced for the first time), then an algorithm calculating this distance should return 2 for (&#8220;TO&#8221;, &#8220;OST&#8221;). Otherwise the algorithm is calculating something else <img src='http://blog.lolyco.com/sean/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>But that doesn&#8217;t actually matter. Sorry for the long comment, and thank you for your answer!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-3106</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Fri, 05 Feb 2010 02:16:35 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-3106</guid>
		<description>Hello Alexander - thanks for your comment. Last time I updated the Java damlev functions (still haven&#039;t got round to doing the same for the MySQL ones), I noticed some counter-intuitive (but not actually &#039;wrong&#039;) results. In &lt;a href=&quot;http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance&quot; rel=&quot;nofollow&quot;&gt;Wikipedia&#039;s article on Damerau-Levenshtein&lt;/a&gt; the example you refer to is used to show a &#039;wrinkle&#039; in the algorithm:

&lt;blockquote&gt;&quot;the number of edit operations needed to make the strings equal under the condition that &lt;strong&gt;no substring is edited more than once&lt;/strong&gt;&quot;&lt;/blockquote&gt;

You and I both know the edit distance between &#039;TO&#039; and &#039;OST&#039; is 2 (swap T and O to get OT, insert S), but Damerau-Levenshtein works on single operations on the original string. The &#039;OT&#039; is an intermediate result (a substring that has been previously edited). The Wikipedia article does mention improved algorithms for what is described as &quot;a proper algorithm to calculate unrestricted Damerau–Levenshtein distance&quot;. I suspect if it gives you a different result to the original algorithm, it possibly isn&#039;t &#039;Damerau-Levenshtein&#039;! Maybe one day I&#039;ll get round to having a look at those other algorithms, but there&#039;s absolutely no chance in the next few months.

(You can see the result in question on &lt;a href=&quot;http://spider.my/damerau-levenshtein.html?s1=TO&amp;s2=OST&amp;lim=4&quot; rel=&quot;nofollow&quot;&gt;the damerau-levenshtein page at spider.my&lt;/a&gt;)</description>
		<content:encoded><![CDATA[<p>Hello Alexander &#8211; thanks for your comment. Last time I updated the Java damlev functions (still haven&#8217;t got round to doing the same for the MySQL ones), I noticed some counter-intuitive (but not actually &#8216;wrong&#8217;) results. In <a href="http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance" rel="nofollow">Wikipedia&#8217;s article on Damerau-Levenshtein</a> the example you refer to is used to show a &#8216;wrinkle&#8217; in the algorithm:</p>
<blockquote><p>&#8220;the number of edit operations needed to make the strings equal under the condition that <strong>no substring is edited more than once</strong>&#8221;</p></blockquote>
<p>You and I both know the edit distance between &#8216;TO&#8217; and &#8216;OST&#8217; is 2 (swap T and O to get OT, insert S), but Damerau-Levenshtein works on single operations on the original string. The &#8216;OT&#8217; is an intermediate result (a substring that has been previously edited). The Wikipedia article does mention improved algorithms for what is described as &#8220;a proper algorithm to calculate unrestricted Damerau–Levenshtein distance&#8221;. I suspect if it gives you a different result to the original algorithm, it possibly isn&#8217;t &#8216;Damerau-Levenshtein&#8217;! Maybe one day I&#8217;ll get round to having a look at those other algorithms, but there&#8217;s absolutely no chance in the next few months.</p>
<p>(You can see the result in question on <a href="http://spider.my/damerau-levenshtein.html?s1=TO&#038;s2=OST&#038;lim=4" rel="nofollow">the damerau-levenshtein page at spider.my</a>)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexander</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-3104</link>
		<dc:creator>Alexander</dc:creator>
		<pubDate>Thu, 04 Feb 2010 19:51:35 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-3104</guid>
		<description>Hi Sean, 

  Thanks for your implementations! 
  I noticed that damlev(&quot;TO&quot;,&quot;OST&quot;) returns 3, while it should be 2 in this case (the test actually is from the Wikipedia article on Damlev distance).
  Do you think it can be fixed? 

Thanks!</description>
		<content:encoded><![CDATA[<p>Hi Sean, </p>
<p>  Thanks for your implementations!<br />
  I noticed that damlev(&#8220;TO&#8221;,&#8220;OST&#8221;) returns 3, while it should be 2 in this case (the test actually is from the Wikipedia article on Damlev distance).<br />
  Do you think it can be fixed? </p>
<p>Thanks!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-2928</link>
		<dc:creator>Jon</dc:creator>
		<pubDate>Mon, 14 Sep 2009 17:17:03 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-2928</guid>
		<description>You&#039;re right, it doesn&#039;t show. I have the habit of trying to fix things that are not broken :)</description>
		<content:encoded><![CDATA[<p>You&#8217;re right, it doesn&#8217;t show. I have the habit of trying to fix things that are not broken <img src='http://blog.lolyco.com/sean/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-2927</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Mon, 14 Sep 2009 14:32:52 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-2927</guid>
		<description>Do you have the bug? I have no regression tests for the MySQL code at the moment. I must get round to it at some point, but I&#039;ve got very little time at the moment. The bug was a trivial coding error in the Java Levenshtein methods in a part of the code that doesn&#039;t exactly mirror the C++ code in the MySQL functions. I just checked the damlevlim256 on a server here and it doesn&#039;t show the problem Bjorn reported.

If you can send me an example of a bad return value, I&#039;ll check it out as soon as I can.</description>
		<content:encoded><![CDATA[<p>Do you have the bug? I have no regression tests for the MySQL code at the moment. I must get round to it at some point, but I&#8217;ve got very little time at the moment. The bug was a trivial coding error in the Java Levenshtein methods in a part of the code that doesn&#8217;t exactly mirror the C++ code in the MySQL functions. I just checked the damlevlim256 on a server here and it doesn&#8217;t show the problem Bjorn reported.</p>
<p>If you can send me an example of a bad return value, I&#8217;ll check it out as soon as I can.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-2926</link>
		<dc:creator>Jon</dc:creator>
		<pubDate>Mon, 14 Sep 2009 13:56:05 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-2926</guid>
		<description>Is the bug fixed in http://spider.my/static/contrib/damlev.zip? I see no differences with my copy of damlev256.cpp.</description>
		<content:encoded><![CDATA[<p>Is the bug fixed in <a href="http://spider.my/static/contrib/damlev.zip" rel="nofollow">http://spider.my/static/contrib/damlev.zip</a>? I see no differences with my copy of damlev256.cpp.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-2909</link>
		<dc:creator>Sean</dc:creator>
		<pubDate>Tue, 11 Aug 2009 10:32:11 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-2909</guid>
		<description>Hello Björn, thanks for reporting. I can confirm that error; I&#039;ll look into it this evening and let you know later.
Sean

&lt;em&gt;--update 2009 Aug 15 New version with regression fixed (and added to static main method tests) uploaded same day. Forgot to update my comment here!&lt;/em&gt;</description>
		<content:encoded><![CDATA[<p>Hello Björn, thanks for reporting. I can confirm that error; I&#8217;ll look into it this evening and let you know later.<br />
Sean</p>
<p><em>&#8211;update 2009 Aug 15 New version with regression fixed (and added to static main method tests) uploaded same day. Forgot to update my comment here!</em></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Björn Törnqvist</title>
		<link>http://blog.lolyco.com/sean/2008/08/27/damerau-levenshtein-algorithm-levenshtein-with-transpositions/comment-page-1/#comment-2908</link>
		<dc:creator>Björn Törnqvist</dc:creator>
		<pubDate>Tue, 11 Aug 2009 07:18:37 +0000</pubDate>
		<guid isPermaLink="false">http://blog.lolyco.com/sean/?p=116#comment-2908</guid>
		<description>Hi Sean,

Thanks for providing your Levenshtein methods.

I believe I found a bug with the damlevlim method. I think it&#039;s easiest described by the following example:

damlevlim(&quot;short&quot;,&quot;shoort&quot;,2);
Result: 2 (should be 1)

damlevlim(&quot;short&quot;,&quot;shrt&quot;,2);
Result: 2 (should be 1)

As you can see the second string has an additional character or is missing a character inside the string.

Can you take a look at it?

Best regards
/Björn Törnqvist</description>
		<content:encoded><![CDATA[<p>Hi Sean,</p>
<p>Thanks for providing your Levenshtein methods.</p>
<p>I believe I found a bug with the damlevlim method. I think it&#8217;s easiest described by the following example:</p>
<p>damlevlim(&#8220;short&#8221;,&#8220;shoort&#8221;,2);<br />
Result: 2 (should be 1)</p>
<p>damlevlim(&#8220;short&#8221;,&#8220;shrt&#8221;,2);<br />
Result: 2 (should be 1)</p>
<p>As you can see the second string has an additional character or is missing a character inside the string.</p>
<p>Can you take a look at it?</p>
<p>Best regards<br />
/Björn Törnqvist</p>
]]></content:encoded>
	</item>
</channel>
</rss>

