bizarre string bug [nevermind]

itistoday
Posts: 429
Joined: Sun Dec 02, 2007 5:10 pm

Post by itistoday »

Never mind, this was my own stupid mistake; see the posts below.

Basically, newLISP claims that two identical strings are not equal.

Given this input file named "test.csv":

Code:

Date, Type, Net, 
"11/22/2008","Web Accept Payment Received","15.00",

Here is the output of the program:

Code:

'Web Accept Payment Received' vs 'Web Accept Payment Received'
data-type=Web Accept Payment Received, filter-type=Web Accept Payment Received
(!= date-type filter-type)=true
0 27

The string "Web Accept Payment Received" is stored in two variables, filter-type and data-type. For filter-type it is defined in the script; for data-type it is read from the file.

When I print the contents of those variables, they appear identical. I even iterated over each character and verified that the ASCII values are the same. However, when I ask for the length, newLISP claims data-type is 0 characters long, and the equality check (as shown above) fails.

Here is the script so far (it's not complete); see the part marked "BUG HERE":

Code:

#!/usr/bin/newlisp

; =============
; = configure =
; =============

(set 'csv-delimiter		",")
(set 'header-date		"Date")
(set 'header-amount		"Net")
(set 'header-type		"Type")

; filters
(set 'filter-type		"Web Accept Payment Received")
(set 'filter-min-amount	15)
(set 'filter-max-amount	20)

; ============
; = end conf =
; ============

; --------------
(context 'DataStore)

(context MAIN)
; --------------

(define (fail) (apply println (args)) (exit 1))
(define-macro (fail-on-nil) (doargs (arg) (if (nil? (eval arg)) (fail arg " is nil"))))
(define-macro (paras)
	(join (map (lambda (x)
		(string x "=" (eval x))
	) (args)) ", ")
)

(set 'csv-input-file (main-args 2))

(if-not csv-input-file (fail "usage: ./" (main-args 1) " <paypal>"))
(if-not (file? csv-input-file) (fail "no such file: " csv-input-file))
(if-not (regex "(.*)\.csv" csv-input-file) (fail "not a csv file: " csv-input-file))

(set 'csv-output-file (append $1 "-out.csv"))

(set 'fin (open csv-input-file "r"))
(set 'fout (open csv-output-file "w"))

(set 'header-list (map trim (parse (read-line fin) csv-delimiter)))

; set the indexes
(set 'index-date (find header-date header-list))
(set 'index-amount (find header-amount header-list))
(set 'index-type (find header-type header-list))
(fail-on-nil index-type index-amount index-date)

; write the header
(write-line fout (append "Date" csv-delimiter " Copies Sold" csv-delimiter " "))

(while (read-line fin)
	(set 'data-list (map (fn (x) (trim x "\"")) (parse (current-line) csv-delimiter)))
	(set 'data-date (data-list index-date))
	(set 'data-type (data-list index-type))
	(set 'data-amount (float (data-list index-amount)))	
	(fail-on-nil data-date data-amount data-type)
	
	
	;; BUG HERE
	(println "'" data-type "' vs '" filter-type "'")
	(println (paras data-type filter-type))
	(println (paras (!= date-type filter-type)))
	(println (length date-type) " " (length filter-type))
	
	; (dostring (c data-type) (println c " - " (char c)))
	; (println)
	; (dostring (c filter-type) (println c " - " (char c)))
	
	(exit)
	;; END BUG
	
	(if (and (= (length date-type) (length filter-type)) (>= data-amount filter-min-amount) (<= data-amount filter-max-amount))
		(begin
			(println data-date " " data-amount)
			(DataStore data-date (+ (if $it $it 0) 1))
		)
		(println "skipping " data-date " $" data-amount)
	)
)


(println "result: " (DataStore))

(close fin)
(close fout)
(exit)

Last edited by itistoday on Sun Jan 11, 2009 4:41 pm, edited 2 times in total.
Get your Objective newLISP groove on.

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latitude 50N longitude 3W

Post by cormullion »

I have no idea what's happening here... :) I'm even getting confused between your two variables "datE-type" and "datA-type":

Code:

(println (paras data-type filter-type))
(println (paras (!= date-type filter-type)))
(println (length date-type) " " (length filter-type))

Are these supposed to be different?

DrDave
Posts: 126
Joined: Wed May 21, 2008 2:47 pm

Post by DrDave »

I think date-type is a typo, and occurs more than once.
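
A minimal sketch of why that typo would give exactly the output posted above, assuming (as the posted `0 27` and `true` suggest) that an unassigned newLISP symbol simply evaluates to nil instead of raising an error. The symbol names mirror the script; date-type is deliberately never set here.

```newlisp
; Sketch only: reproduces the effect of the misspelled symbol in isolation.
; date-type is never assigned, so it evaluates to nil.
(set 'filter-type "Web Accept Payment Received")
(set 'data-type   "Web Accept Payment Received")

(println (nil? date-type))             ; is the misspelled symbol unset?
(println (length date-type))           ; length of nil, not of the string
(println (!= date-type filter-type))   ; nil compared with a string
(println (= data-type filter-type))    ; the comparison that was intended
```

Printing a `nil?` check on each variable next to its value is a cheap way to catch this kind of slip, since newLISP does not complain about reading a symbol that was never set.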
...it is better to first strive for clarity and correctness and to make programs efficient only if really needed.
"Getting Started with Erlang" version 5.6.2

itistoday

Post by itistoday »

Don't feel bad, it appears I was confused between those two variables as well! Thanks! It appears you've fixed the "bug"! (I feel rather silly now :-p).

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Post by Lutz »

In your case it was just a typo, but there is a real case where two strings look alike yet are different. This happens when strings contain binary zeros as a result of working with imported C functions:

Code:

> (set 'A "abc" 'B "abc\000\000") ; B has trailing two 0's

> (println A " " B) ; println strips the 0's
abc abc

> (= A B)
nil

> (println (length A) ":" (length B))
3:5

> (= A (get-string B))
true
>

Here 'get-string' reads the string only up to the first zero byte, which drops the trailing zeros.

For those of you who are C programmers:

Code:

> (import "libc.dylib" "strcat")

> (set 's (dup "\000" 20))  ; create a buffer with 20 zeros

> (strcat s "abc")

> (strcat s "def")

> (= s "abcdef")
nil

> (= (get-string s) "abcdef")
true
> 

cormullion

Post by cormullion »

You were unlucky: two commonly typed words differing only in the last letter, with "e" and "a" looking similar enough at small point sizes. My newLISP coding is now done in 18-point type... :)

btw - hope the spying business is doing well!

itistoday

Post by itistoday »

Lutz wrote: In your case it was just a typo, but actually there is a case where two strings look alike but are different; when strings contain binary zeros as a result of working with imported C functions:

Thanks for the heads-up, Lutz!

cormullion wrote: you were unlucky - two commonly typed words differing only by the last letter, and both "e" and "a" similar enough at small point sizes. My newLISP coding is now done using 18 point type.. :)

Yeah... Monaco 12pt here could be the culprit, but I just can't stand using large fonts for coding; I like to see as much code as possible at once without having to scroll around.

cormullion wrote: btw - hope the spying business is doing well!

Thanks! :-p

I'm planning on covering newLISP on the site's blog as well. One of the things I'd like to share is my TextMate bundle for newLISP; here's a screenshot of some of the highlighting:

[Image: screenshot of newLISP syntax highlighting in TextMate]