ralph.ronnquist wrote:
The input buffer (buf) is clipped at 4096 characters, and this can easily leave a multi-byte UTF8 character cut off at the end. Thus, since regex has its UTF8 glasses solidly stuck on, one can't apply it to buf.
Right. I'm taking raw bytes from the network layer, and trying to reassemble them into something usable at the application layer.
So I thought that, instead of immediately cleaning the parse result when extracting lines, one could let partial always be its last element, chopped off (or cleaning lines afterwards). That would keep to the same functionality and avoid using regex on buf.
Ah. So the problem wasn't some nasty channel having a bad topic; it was perhaps that the network layer broke the packet in the middle of a UTF-8 character. If there were people using the Asian encodings, the bug would still come up, of course.
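To make the "partial is always the last element" idea concrete, here is a rough sketch in Python rather than newLISP, so nothing in it says anything about how newLISP's parse or regex behave. The names extract_lines, partial and chunk are made up for the illustration. It splits on the raw bytes \r\n, carries the unfinished tail over to the next read, and decodes only complete lines:

```python
def extract_lines(partial: bytes, chunk: bytes):
    """Return (complete_lines, new_partial) for raw CRLF-terminated input."""
    pieces = (partial + chunk).split(b"\r\n")
    # The last piece is whatever followed the final CRLF: an unfinished
    # line, or b"" if the chunk ended exactly on a line boundary.
    new_partial = pieces.pop()
    return pieces, new_partial


# "é" encodes as the two bytes C3 A9; cut the stream between them to
# imitate a read that ends in the middle of a character.
stream = "PRIVMSG #chan :café\r\nPING :x\r\n".encode("utf-8")
cut = stream.index(b"\xc3") + 1
first, second = stream[:cut], stream[cut:]

# First read: no CRLF yet, so everything (including the lone C3) is kept.
lines, partial = extract_lines(b"", first)

# Second read: the tail is glued back on before splitting.
more, partial = extract_lines(partial, second)

# Only complete lines are decoded, so the split character is whole again.
decoded = [line.decode("utf-8", errors="replace") for line in more]
```

The point of the design is that no UTF-8-aware operation ever sees the end of the buffer; the dangling byte just sits in partial until the rest of the line arrives.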
EDIT 2: the more I study this the more I realize how wrong I was. As far as I can see, that particular error code only arises in quite particular cases, such as the functions first and rest, and implicit string indexing. The latter would be the candidate cause here, except that the regex expressions don't use implicit indexing.
I guess my question is: why is regex assuming a UTF8 string when I didn't pass it the UTF8 option? And since that option is now set by default, could there be a flag to disable it? When searching a string for \r and \n, I am in raw byte mode. IRC allows UTF8, but only as a subset of raw byte streams.
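For what it's worth, looking for \r and \n at the byte level is safe even when the stream carries UTF-8, because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, while CR and LF are 0x0D and 0x0A. A small Python check of that property (again just an illustration, not newLISP; the function name and the sample buffer are invented):

```python
def multibyte_bytes_all_high() -> bool:
    """True if every byte of every multi-byte UTF-8 encoding is >= 0x80."""
    for cp in range(0x80, 0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        if min(chr(cp).encode("utf-8")) < 0x80:
            return False
    return True


all_high = multibyte_bytes_all_high()

# A buffer that ends on a dangling lead byte (C3) can still be searched
# for CRLF as plain bytes; no decoding is involved.
buf = b"TOPIC #chan :caf\xc3\xa9\r\nNOTICE :caf\xc3"
pos = buf.find(b"\r\n")
line, rest = buf[:pos], buf[pos + 2:]
```

So a byte search for the line terminator can never land inside a character, and the truncated tail simply stays in rest for the next read.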
Would specifying "0" as the option be enough to disable UTF8 parsing in regex and eliminate that error message?
Ralph, thanks for reading the source. I appreciate the review. Next I will try to add readline support. I think I'll make two apps; one for typing into, one for viewing output. Then you run them both inside tmux, each one inside its own window. :)
Cavemen in bearskins invaded the ivory towers of Artificial Intelligence. Nine months later, they left with a baby named newLISP. The women of the ivory towers wept and wailed. "Abomination!" they cried.