parsing numbers (error?)

Fanda
Posts: 253
Joined: Tue Aug 02, 2005 6:40 am

Post by Fanda »

It seems that there is an error in 'parse'. It behaves differently on different numbers:

> (parse "Feb 07,2007")
("Feb" "07" "," "2007")

> (parse "Feb 08,2007")
("Feb" "0" "8" "," "2007")

> (dotimes (i 20) (println (parse (format "Feb %d,2007" i))))
("Feb" "0" "," "2007")
("Feb" "1" "," "2007")
("Feb" "2" "," "2007")
("Feb" "3" "," "2007")
("Feb" "4" "," "2007")
("Feb" "5" "," "2007")
("Feb" "6" "," "2007")
("Feb" "7" "," "2007")
("Feb" "8" "," "2007")
("Feb" "9" "," "2007")
("Feb" "10" "," "2007")
("Feb" "11" "," "2007")
("Feb" "12" "," "2007")
("Feb" "13" "," "2007")
("Feb" "14" "," "2007")
("Feb" "15" "," "2007")
("Feb" "16" "," "2007")
("Feb" "17" "," "2007")
("Feb" "18" "," "2007")
("Feb" "19" "," "2007")
("Feb" "19" "," "2007")

> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i))))
("Feb" "00" "," "2007")
("Feb" "01" "," "2007")
("Feb" "02" "," "2007")
("Feb" "03" "," "2007")
("Feb" "04" "," "2007")
("Feb" "05" "," "2007")
("Feb" "06" "," "2007")
("Feb" "07" "," "2007")
("Feb" "0" "8" "," "2007")
("Feb" "0" "9" "," "2007")
("Feb" "010" "," "2007")
("Feb" "011" "," "2007")
("Feb" "012" "," "2007")
("Feb" "013" "," "2007")
("Feb" "014" "," "2007")
("Feb" "015" "," "2007")
("Feb" "016" "," "2007")
("Feb" "017" "," "2007")
("Feb" "01" "8" "," "2007")
("Feb" "01" "9" "," "2007")
("Feb" "01" "9" "," "2007")
Fanda

nigelbrown
Posts: 429
Joined: Tue Nov 11, 2003 2:11 am
Location: Brisbane, Australia

Post by nigelbrown »

It looks like 'parse' interprets a number with a leading 0 as octal. 07 is valid octal but 08 is not, so "08" is split into 0 and 8.

The manual says:
parse tokenizes according to newLISP's internal parsing rules.

and the numbers section says:
Octals start with an optional + (plus) or - (minus) sign and a 0 (zero), followed by any combination of the octal digits: 01234567. Any other character ends the octal number.

Nigel
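
The octal rule Nigel quotes can be modeled in a few lines. This is only a rough Python sketch of the behavior seen in the output above, not newLISP's actual tokenizer: the function name `tokenize` and the handling of words and punctuation are assumptions made for illustration. Only the octal rule (a 0 followed by digits 0-7, ended by any other character) comes from the manual text.

```python
import re

# Octal: optional sign, a 0, then any run of octal digits (rule from the manual).
OCTAL = re.compile(r"[+-]?0[0-7]*")
# Decimal and word patterns are simplifying assumptions for this sketch.
DECIMAL = re.compile(r"[+-]?[1-9][0-9]*")
WORD = re.compile(r"[A-Za-z_]+")


def tokenize(text):
    """Split text into tokens, ending an octal number at the first non-octal digit."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
            continue
        for pattern in (OCTAL, DECIMAL, WORD):
            m = pattern.match(text, pos)
            if m:
                tokens.append(m.group())
                pos = m.end()
                break
        else:
            # Anything else (such as the comma) becomes a one-character token.
            tokens.append(ch)
            pos += 1
    return tokens


print(tokenize("Feb 07,2007"))   # "07" is a complete octal number
print(tokenize("Feb 08,2007"))   # "8" is not an octal digit, so it ends the number
print(tokenize("Feb 018,2007"))  # "01" is octal, "8" starts a new token
```

Under these assumptions the sketch splits "08" into "0" and "8" and "018" into "01" and "8", the same pattern as in Fanda's listing.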

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Post by Lutz »

Yes, exactly as Nigel explains. But if you specify the break string in 'parse', it will parse as expected:

> (dotimes (i 20) (println (parse (format "Feb 0%d,2007" i) "\\s|," 0)))
("Feb" "00" "2007")
("Feb" "01" "2007")
("Feb" "02" "2007")
("Feb" "03" "2007")
("Feb" "04" "2007")
("Feb" "05" "2007")
("Feb" "06" "2007")
("Feb" "07" "2007")
("Feb" "08" "2007")
("Feb" "09" "2007")
("Feb" "010" "2007")
("Feb" "011" "2007")
("Feb" "012" "2007")
("Feb" "013" "2007")
("Feb" "014" "2007")
("Feb" "015" "2007")
("Feb" "016" "2007")
("Feb" "017" "2007")
("Feb" "018" "2007")
("Feb" "019" "2007")
("Feb" "019" "2007")
> 
See also the new 'parse-date'

Lutz

PS: "internal parsing rules" means the same rules used for newLISP source code
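
The effect of an explicit break pattern can be illustrated outside newLISP as well. The following is a hedged Python sketch, not the newLISP implementation: `split_on` is a hypothetical helper name, and it simply applies the same pattern `\s|,` with Python's `re.split`. Because the text is cut only where the pattern matches, the digits are never inspected and "08" stays in one piece.

```python
import re


def split_on(text, break_pattern=r"\s|,"):
    """Split text wherever break_pattern matches; the fields are left untouched."""
    return re.split(break_pattern, text)


fields = split_on("Feb 08,2007")
print(fields)

# Converting the day with an explicit base of 10 avoids any octal reading.
day = int(fields[1], 10)
print(day)
```

Note that the comma disappears from the result here, just as in the listing above, because it is part of the break pattern rather than a token.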

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latitude 50N longitude 3W

Post by cormullion »

it's a bit like the 'bug' I had to track down. My program worked perfectly for 9 months and then went wrong... :-)

http://newlisper.blogspot.com/2006/09/my-mistake.html

Fanda
Posts: 253
Joined: Tue Aug 02, 2005 6:40 am

Post by Fanda »

Octal numbers seem to cause confusion. Could we change their format to something similar to hex numbers?

"\x12" -> "\o22"
0x12 -> 0o22 [zero - small "o"]

or maybe use "c":
"\x12" -> "\c22"
0x12 -> 0c22 [zero - small "c"]

Fanda

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Post by Lutz »

It is better to stay with standard conventions in this case. C, Perl and Python all do the same thing here.

Lutz
