Code golf at Stackoverflow. Byte array substring searching.

October 25th, 2009 | by Sean |

I don’t know what I was searching for when I found this ‘code golf’ challenge at Stack Overflow. The challenge was to find a substring of bytes in a byte array, returning the start position. It cost me half an hour or so, so I thought I’d post my attempt up for posterity. Many of the answers were aiming at presenting the code in the fewest characters possible, so that’s what I did too.

I didn’t do an in-depth survey, but it looked like a lot of the contributions were naïve string search, as is mine. Better search algorithms exist, so be warned! Here’s my 116 characters of Java:

int m(byte[]a,int i,int y,byte[]b,int j,int z)
{return i<y?j<z?a[i]==b[j++]?m(a,++i,y,b,j,z)
:m(a,0,y,b,j,z):-1:j-y;}

Not pretty! I have to confess to writing it with whitespace and if-else first, before stripping out the whitespace and converting if-else to (boolean expression)?statement-if-true:statement-if-false. One thing I noticed in an early attempt is that you can remove the whitespace between the ‘return’ statement and ‘-1’. Should you ever need to do that, you know who to thank!

Before you complain about the missing javadoc markup, the first byte array is the string to search for. The second byte array is the string to search in. The two ints following each byte array are the start index in each (typically zero – it saves me having to write another method to do that, and hence, precious characters in my golf!) and the length of the byte array – to save me using the lengthy ‘.length’ property! The method returns, as per spec, the position of the first matching substring or -1 if a match is not found.

Some of the really short contributions are from alleged programing languages with built in substring handling. You’ll forgive me if I sneer at those attempts and say they’re not really programming languages, they’re applications with complicated interfaces.

Post a Comment