Parsing file gives ERR: string token too long :

joejoe
Posts: 173
Joined: Thu Jun 25, 2009 5:09 pm
Location: Denver, USA

Post by joejoe »

I can't figure out what I should be doing differently to parse a file that apparently has a long string token that newLISP doesn't like.

I have this code:

Code:

(set 'words (parse (read-file "myfile")))
(exit)

and I get the error:

Code:

ERR: string token too long : "></div>\r\n\t\t\t\t\t\n\t\t\t\t</div>\n\t\t\t</div> \n\t\t "

It's an HTML source file I am trying to pull pieces from into lists.

Thank you for direction!

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Post by Lutz »

When you use 'parse' without the optional break string parameter, the text is tokenized as if it were newLISP source. In newLISP source, "..." quoted strings are limited to 2048 characters; for longer strings the [text], [/text] tags must be used. Another limitation is that symbol tokens in newLISP source cannot be longer than 255 characters.

Use a string break pattern (either a simple string or a regular expression), and the problem will go away.
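
As a minimal sketch of what a break pattern looks like in practice: the sample text and the two patterns below are illustrative assumptions, not taken from the original file. In real use the string would come from (read-file "myfile").

```newlisp
; Hypothetical sample text standing in for the contents of the HTML file.
(set 'html "<div>\r\n\t<p>one two</p>\n</div>")

; Regular-expression break: split at line endings (LF or CR LF).
; The trailing 0 asks parse to treat the break string as a regex.
(set 'lines (parse html "\r?\n" 0))

; Regular-expression break: split at any run of whitespace.
(set 'words (parse html "\\s+" 0))

(println (length lines) " lines, " (length words) " words")
```

Because a break pattern is given, the text is split as plain data and is no longer tokenized by newLISP source rules, so the string and symbol length limits described above should not come into play.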

joejoe
Posts: 173
Joined: Thu Jun 25, 2009 5:09 pm
Location: Denver, USA

Post by joejoe »

Thank you, Lutz, for pointing out how to use a string break. I didn't know what that meant before.

Might I also ask what function I would use to pick out just a section of the file (using a regular expression), instead of having parse turn the entire file into a string before I get to the elements I'm after?

I first thought to use find or regex, but it seems most of those functions expect a newLISP string that has already been created.

thank you.

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latiitude 50N longitude 3W

Post by cormullion »

I think you can use search for this.

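Code:

; Open the file for reading; 'search' works on the file handle.
(set 'file (open "program.c" "r"))
; With the flag true, each hit leaves the file position after the match,
; so the loop advances; 0 selects regex matching and sets $1.
(while (search file "#define (.*)" true 0)
   (println $1))
(close file)

I don't know if this avoids reading the whole file into memory or not...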
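
For comparison, here is a rough sketch of the string-based route using find-all. Unlike search, it works on a string that is already in memory, so it does not avoid reading the whole file. The sample text and the <div> pattern are assumptions made up for illustration.

```newlisp
; Hypothetical sample text; in real use this could be (read-file "myfile").
(set 'html "<div class=\"a\">first</div>\n<div class=\"b\">second</div>")

; Collect the text between each <div ...> and </div>.
; $1 is the first parenthesized group; 0 is the default regex option.
(set 'pieces (find-all {<div[^>]*>(.*?)</div>} html $1 0))

(println pieces)
```

The curly braces quote the pattern without backslash processing, which tends to keep regular expressions easier to read.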

joejoe
Posts: 173
Joined: Thu Jun 25, 2009 5:09 pm
Location: Denver, USA

Post by joejoe »

Thanks cormullion :)

I don't yet know enough to make use of the code you put in your post, but I used the example above it from the manual, and it works great, no errors.

I appreciate the help. Thanks cormullion and Lutz!
