Thanks for the explanation, Lutz. I tried to reproduce the error with version 10.6.2, but it didn't show up:
> (set 'foo (string "abcd" (pack "b" (int "0b11001111"))))
"abcd�"
> (regex "(\r|\n)$" foo 0)
nil
> (set 'foo (string "abcd" (pack "b" (int "0b11101111")) "e"))
"abcd�e"
> (regex "(\r|\n)$" foo 0)
nil
> (regex "(\r|\n)$" foo)
nil
Then I thought: "implicit indexing, ah"
> (regex "\r|\n" (foo -1))
nil
> (set 'foo (string "abcd" (pack "b" (int "0b11001111"))))
"abcd�"
> (regex "\r|\n" (foo -1))
nil
So I'm still not sure exactly how my code triggered the exception. I'd like to reproduce the bug so I can fix it in my code.
Update
You mentioned a string containing a 0 byte, so I tested that. It still doesn't trigger the exception:
> (set 'foo (string "abcd" (pack "b" 0) "e"))
"abcde"
> (regex "\r|\n" (foo -1))
nil
> (regex "\r|\n" foo)
nil
> (set 'foo (string "abc\rd" (pack "b" 0) "e\n"))
"abc\rde\n"
> (regex "\r|\n" foo)
("\r" 3 1)
> (regex "\n" foo)
("\n" 6 1)
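Rather than guessing one byte at a time, the same test could be swept over every byte value. This is only a rough sketch: it reuses the pattern and the "abcd" prefix from the tests above, the names b, foo and result are illustrative, and it assumes the exception (if any) depends on a single trailing byte, which may not be the case.

```
; Sketch: try every byte value as a suffix and report any that make regex throw.
; Assumption: the failure depends on one trailing byte; the real trigger may differ.
(for (b 0 255)
  (set 'foo (string "abcd" (pack "b" b)))
  ; catch returns nil when the expression throws and stores the message in result
  (unless (catch (regex "(\r|\n)$" foo 0) 'result)
    (println "byte " b " -> " result)))
```

If nothing is printed, no single-byte suffix triggers the error with this pattern, and the cause is more likely in how the original string was built or passed to regex.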
OpenBSD
OpenBSD recently added support for Lua patterns to its web server, httpd; I read the manpage. The patterns are almost like regular expressions, but smaller and simpler, very fast to implement, and they include some nice things like paren-matching. The implementation is under 700 lines of code.
http://www.openbsd.org/cgi-bin/man.cgi/ ... y=patterns
http://comments.gmane.org/gmane.os.openbsd.tech/42569
there is some great interest in getting support for rewrites and
better matching in httpd. I refused to implement this using regex, as
regex is extremely complicated code, there have been lots of bugs,
they allow, if not specified carefully, dangerous recursions and
ReDOS, and I would add another potential attack surface in httpd.
Thanks to tedu@'s hint at BSDCan, I stumbled across Lua's pattern
matching implementation. It is relatively small (less than 700loc),
powerful, portable C code, MIT-licensed, and doesn't suffer from some
of regex' problems (eg., it doesn't allow recursive captures). I
ported it on my flight back from Ottawa, KNF'ed it, and turned it into
a C API without the Lua bindings. No, this diff does not bring the
Lua language to httpd!