read-line optimisation

Astrobe
Posts: 39
Joined: Mon Jan 11, 2010 9:41 pm

Post by Astrobe »

Hello,

While programming a little tool that analyses a log file, I noticed that read-line is a bit slower than one would expect. Looking at its code, it appeared to me that reading the stream char-by-char could be the cause. I modified it to use fgets instead:

Code:

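char * readStreamLine(STREAM * stream, FILE * inStream)
{
char buf[MAX_STRING];
size_t l;

openStrStream(stream, MAX_STRING, 1);

/* read up to MAX_STRING - 1 characters in one call instead of char-by-char */
if(fgets(buf, MAX_STRING, inStream)!=NULL)
{
	l=strlen(buf);
	/* strip a trailing LF, then a CR before it; the l>0 checks keep buf[l-1] in bounds */
	if(l>0 && buf[l-1]==0x0A)
	{
		buf[--l]=0;
		if(l>0 && buf[l-1]==0x0D) buf[--l]=0;
	}
	writeStreamStr(stream, buf, l);
	return(stream->buffer);
}
else
{
	/* clear the EOF indicator so the handle can be read again later */
	if(feof(inStream)) clearerr(inStream);
	return NULL;
}
}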
(this hasn't been heavily tested)

However, it doesn't strictly respect the original semantics of read-line with regard to newline characters. To be honest, I don't understand why they are that way, in particular why a newline at the end of the file has to be erased.

Also, the part about TRU64 is missing. I don't know whether fgets handles EINTR correctly by itself on this platform. I've worked with systems plagued with a similar illness before, and unfortunately the FILE library (which was not standard IIRC) didn't handle it very well.

On the performance side, timings drop from 250ms to 50ms.
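
For anyone wanting to reproduce this kind of comparison outside newLISP, here is a minimal sketch of the two reading strategies. The function names and the 4096-byte buffer are assumptions made for illustration, not code from the newLISP source, and the sketch assumes text input without embedded NUL bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counts newline characters by reading one character per call. */
long count_lines_fgetc(FILE *in)
{
    long lines = 0;
    int c;

    assert(in != NULL);
    while ((c = fgetc(in)) != EOF)
        if (c == '\n') lines++;
    return lines;
}

/* Counts newline characters by reading up to one buffer per call. */
long count_lines_fgets(FILE *in)
{
    char buf[4096];   /* hypothetical buffer size */
    long lines = 0;

    assert(in != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t n = strlen(buf);
        /* a chunk ends in '\n' only when a line was completed */
        if (n > 0 && buf[n - 1] == '\n') lines++;
    }
    return lines;
}
```

Both functions should return the same count for the same file; wrapping each call in clock() on a large log file, with a rewind() in between, gives a rough idea of the difference on a given platform.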

Lutz
Posts: 5288
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Re: read-line optimisation

Post by Lutz »

Thanks Astrobe. At the moment I don't recall why read-line was coded using fgetc() and not fgets(). In the past, though, a lot of problems occurred with read-line: on different OSs, in CGI with different client web browsers, with sockets used as file handles on Unix, and with pipes. So this change will need a lot of testing, but the speed improvement is certainly worth it.

Re: read-line optimisation

Post by Lutz »

I just realized that your version limits the line length to MAX_STRING. readStreamLine() should be able to read a line of any length.
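
One way to lift the limit while still using fgets() is to read in fixed-size chunks and keep appending until a chunk ends in a newline. The following is only a sketch under stated assumptions: it uses a plain malloc'ed buffer instead of newLISP's STREAM, and read_long_line and CHUNK are hypothetical names that do not appear in the newLISP source.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 256   /* hypothetical chunk size, stands in for MAX_STRING */

/* Reads one line of any length from `in` into a malloc'ed buffer.
   Returns NULL if nothing could be read (end of file, error, or out of
   memory); the caller frees the result.
   A trailing LF, and a CR directly before it, are removed. */
char *read_long_line(FILE *in)
{
    char chunk[CHUNK];
    char *line = NULL;
    size_t len = 0;

    assert(in != NULL);

    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t n = strlen(chunk);
        char *grown = realloc(line, len + n + 1);
        if (grown == NULL) {
            free(line);
            return NULL;
        }
        line = grown;
        memcpy(line + len, chunk, n + 1);   /* copy including the NUL */
        len += n;
        if (len > 0 && line[len - 1] == '\n')
            break;                          /* a complete line was read */
    }

    if (line == NULL)
        return NULL;                        /* fgets() returned no data */

    if (len > 0 && line[len - 1] == '\n') {
        line[--len] = '\0';
        if (len > 0 && line[len - 1] == '\r')
            line[--len] = '\0';
    }
    return line;
}
```

A last line without a terminating newline is returned as-is, and the next call returns NULL. Where POSIX getline() is available it does a similar job, but it is not part of standard C.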

Re: read-line optimisation

Post by Lutz »

Seems to pass all tests:

http://www.newlisp.org/downloads/develo ... nprogress/

Linux, Windows, OSX and FreeBSD seem to be fine. Gains are biggest on Linux and on longer lines than usually found in text files. For TRU64 the old method has been left, as I cannot test it.
