parsing numbers (error?)

Posted: Thu Feb 08, 2007 1:45 pm
by Fanda
It seems that there is an error in 'parse'. It behaves differently on different numbers:

Code:

> (parse "Feb 07,2007")
("Feb" "07" "," "2007")

> (parse "Feb 08,2007")
("Feb" "0" "8" "," "2007")

Code:

> (dotimes (i 20) (println (parse (format "Feb %d,2007" i))))
("Feb" "0" "," "2007")
("Feb" "1" "," "2007")
("Feb" "2" "," "2007")
("Feb" "3" "," "2007")
("Feb" "4" "," "2007")
("Feb" "5" "," "2007")
("Feb" "6" "," "2007")
("Feb" "7" "," "2007")
("Feb" "8" "," "2007")
("Feb" "9" "," "2007")
("Feb" "10" "," "2007")
("Feb" "11" "," "2007")
("Feb" "12" "," "2007")
("Feb" "13" "," "2007")
("Feb" "14" "," "2007")
("Feb" "15" "," "2007")
("Feb" "16" "," "2007")
("Feb" "17" "," "2007")
("Feb" "18" "," "2007")
("Feb" "19" "," "2007")
("Feb" "19" "," "2007")

> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i))))
("Feb" "00" "," "2007")
("Feb" "01" "," "2007")
("Feb" "02" "," "2007")
("Feb" "03" "," "2007")
("Feb" "04" "," "2007")
("Feb" "05" "," "2007")
("Feb" "06" "," "2007")
("Feb" "07" "," "2007")
("Feb" "0" "8" "," "2007")
("Feb" "0" "9" "," "2007")
("Feb" "010" "," "2007")
("Feb" "011" "," "2007")
("Feb" "012" "," "2007")
("Feb" "013" "," "2007")
("Feb" "014" "," "2007")
("Feb" "015" "," "2007")
("Feb" "016" "," "2007")
("Feb" "017" "," "2007")
("Feb" "01" "8" "," "2007")
("Feb" "01" "9" "," "2007")
("Feb" "01" "9" "," "2007")
Fanda

Posted: Thu Feb 08, 2007 2:10 pm
by nigelbrown
It looks like 'parse' interprets a number with a leading 0 as octal. 07 is a valid octal number, but 08 is not (8 is not an octal digit), so it is split into 0 and 8.

The manual says:
"parse tokenizes according to newLISP's internal parsing rules."

and the numbers section says:
"Octals start with an optional + (plus) or - (minus) sign and a 0 (zero), followed by any combination of the octal digits: 01234567. Any other character ends the octal number."

Nigel
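
The octal rule quoted above also applies to number literals typed at the prompt or written in source code, which may be easier to see than the token splitting. A minimal sketch; the values in the comments follow from the quoted rule and are not output copied from this thread:

```newlisp
; A leading zero marks an octal literal, so these are not decimal 17 and 10.
(println 017)         ; octal 17 = 1*8 + 7 = decimal 15
(println (+ 010 1))   ; octal 10 = decimal 8, so the sum is 9
```

A token such as 08 cannot be read this way, because the octal number ends at the first character that is not an octal digit.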

Posted: Thu Feb 08, 2007 2:51 pm
by Lutz
Yes, exactly as Nigel explains. But if you specify the break string in 'parse' (here a regular expression, enabled by the trailing option 0), it will parse as expected:

Code:

> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i) "\\s|," 0)))
("Feb" "00" "2007")
("Feb" "01" "2007")
("Feb" "02" "2007")
("Feb" "03" "2007")
("Feb" "04" "2007")
("Feb" "05" "2007")
("Feb" "06" "2007")
("Feb" "07" "2007")
("Feb" "08" "2007")
("Feb" "09" "2007")
("Feb" "010" "2007")
("Feb" "011" "2007")
("Feb" "012" "2007")
("Feb" "013" "2007")
("Feb" "014" "2007")
("Feb" "015" "2007")
("Feb" "016" "2007")
("Feb" "017" "2007")
("Feb" "018" "2007")
("Feb" "019" "2007")
("Feb" "019" "2007")
> 
See also the new 'parse-date'.

Lutz

PS: "internal parsing rules" means the same rules newLISP uses to parse its own source code.
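
Splitting with a break string keeps "08" in one piece, but it is still a string. If the day is then needed as a number, the leading zero can cause the same octal surprise during conversion. A rough sketch, assuming the installed version's 'int' accepts a default value and a number base as its second and third arguments (check the manual for your version); the variable names parts, day and year are only illustrative:

```newlisp
; Split on whitespace or a comma, as in the regex call shown above.
(set 'parts (parse "Feb 08,2007" "\\s|," 0))   ; ("Feb" "08" "2007")

; Convert with an explicit base of 10 so the leading zero is not taken
; as an octal prefix. The 0 is the default returned when the string
; cannot be converted (assumed three-argument form of 'int').
(set 'day  (int (nth 1 parts) 0 10))
(set 'year (int (nth 2 parts) 0 10))

(println day " " year)
```

Under that assumption the day comes out as decimal 8 rather than being cut off at the zero.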

Posted: Thu Feb 08, 2007 5:37 pm
by cormullion
it's a bit like the 'bug' I had to track down. My program worked perfectly for 9 months and then went wrong... :-)

http://newlisper.blogspot.com/2006/09/my-mistake.html

Posted: Fri Feb 09, 2007 12:24 pm
by Fanda
Octal numbers seem to cause confusion. Could we change their format to something similar to hex numbers?

"\x12" -> "\o22"
0x12 -> 0o22 [zero - small "o"]

or maybe use "c":
"\x12" -> "\c22"
0x12 -> 0c22 [zero - small "c"]

Fanda

Posted: Fri Feb 09, 2007 12:49 pm
by Lutz
It is better to stay with standard conventions in this case. C, Perl and Python all do the same thing here.

Lutz