Parsing through a CSV file


Postby nix » Fri May 03, 2013 1:53 pm

Hello,

I am trying to parse a CSV file using http://static.artfulcode.net/newlisp/cs ... parse-file.

I am very new to newLISP, so please bear with me. Below in the code block is what I have started with.
Code: Select all
(load '/path/csv.lsp')
(define (csv_path ("/path/file.csv")))
(csv_path)
(CSV:parse-string("LastSeen")


So I can get newLISP to parse through the whole file, and it's really fast, much faster than Python.

What I am trying to do is parse the CSV file, then filter on the columns (like Excel does), grab certain strings, and print them to stdout.

Any help would be appreciated. Also, if someone could point me in the right direction for a mailing list or Usenet group that I can lean on until I get used to programming in newLISP, that would be great.

Thanks in advance
nix
 
Posts: 2
Joined: Thu May 02, 2013 9:40 pm

Re: Parsing through a CSV file

Postby cormullion » Fri May 03, 2013 2:47 pm

It looks like you need to work through the documentation a bit... :) But here's something to get started:

Code: Select all
(load "csv.lsp")
(define csv_path "file.csv")
(set 'l (CSV:parse-file csv_path))


Now that you have stored the results of the parse-file function in a list, you're free to process the list. A simple way to do this is to go through line by line and look for what you want:

Code: Select all
(dolist (e l)
  (if (find "godot" e)
      (println e)))
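
To filter on one column rather than on a whole row, something along these lines may help. This is only a sketch: the sample rows, the column positions, and the names rows, it-rows and dept-rows are made up for illustration, and the data is written out literally so that it runs without csv.lsp. Note that find on a list only matches a whole element, so a substring search has to look inside each field.

```newlisp
;; Hypothetical sample rows, in the list-of-lists shape that
;; CSV:parse-file returns (every field is a string).
(set 'rows '(("Name" "Dept" "Ext")
             ("Bob" "Accting Dept" "20")
             ("Sue" "Personnel" "132")
             ("Ted" "IT" "42")))

;; Exact match on one column: keep rows whose second field (index 1)
;; is "IT". The length check skips rows that are too short to index.
(set 'it-rows
  (filter (fn (row) (and (> (length row) 1) (= (row 1) "IT"))) rows))

;; Substring match on any field: find on a string searches inside it,
;; while find on a list only matches a whole element.
(set 'dept-rows
  (filter (fn (row) (exists (fn (field) (find "Dept" field)) row)) rows))

;; Print the first field of each exact-match row to stdout.
(dolist (row it-rows)
  (println (row 0)))
```

With the sample data above, only Ted's row passes the exact match, and the header row plus Bob's row pass the substring match.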


There are plenty of other things you can try, but keep it simple for now!

This forum is the best place for newLISP help!
cormullion
 
Posts: 2028
Joined: Tue Nov 29, 2005 8:28 pm
Location: latitude 50N longitude 3W

Re: Parsing through a CSV file

Postby rickyboy » Fri May 03, 2013 3:08 pm

Looks like you can use the module this way. Suppose you have a comma-separated values (CSV) file called file.csv with these contents:

Code: Select all
1,2,3,4
Bob,"Accting Dept",20
"Sue","Personnel",132
"Ted","IT", 42,"More stuff for some odd reason"

Then in newLISP, after you load the CSV module (csv.lsp), just say

Code: Select all
> (CSV:parse-file "file.csv")
(("1" "2" "3" "4")
 ("Bob" "Accting Dept" "20")
 ("Sue" "Personnel" "132")
 ("Ted" "IT" " 42" "More stuff for some odd reason"))

Notice that the contents of your csv file are now converted to lists for easier processing in newLISP (LISP meaning "LISt Processing" :).

Now, it is your job to manipulate these newLISP lists to get the info you want. I hope that makes sense.
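
As a rough sketch of that kind of manipulation, the snippet below pulls out one column and filters on a numeric field. The rows are the parse result shown above, written out literally so the snippet runs without csv.lsp; the helper field-to-int and the threshold of 30 are invented for the example.

```newlisp
;; The parsed rows from the example above, written out literally so this
;; snippet runs without csv.lsp or file.csv.
(set 'rows '(("1" "2" "3" "4")
             ("Bob" "Accting Dept" "20")
             ("Sue" "Personnel" "132")
             ("Ted" "IT" " 42" "More stuff for some odd reason")))

;; Pull out one column: the first field of every row.
(set 'names (map (fn (row) (row 0)) rows))

;; Fields are strings, so convert before comparing numerically.
;; trim removes stray spaces (note " 42"); the 0 is a default for
;; non-numeric text and the 10 forces base 10.
(define (field-to-int s) (int (trim s) 0 10))

;; Keep rows whose third field (index 2) is greater than 30.
(set 'big (filter (fn (row) (> (field-to-int (row 2)) 30)) rows))

(println names)
(println (map first big))
```

Every row here has at least three fields, so indexing with (row 2) is safe; on ragged real-world data a length check before indexing would be prudent.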

Here's a caveat, though. Be careful of a bug I just found while testing the module csv.lsp. It does not handle one corner case: an input line with an empty field (no value in a column). As soon as the CSV parser encounters the first empty field, it stops processing that line. Compare the following evaluations at the REPL to see the problem.

Code: Select all
> (CSV:parse-string "1,2,3,4")
(("1" "2" "3" "4"))
> (CSV:parse-string ",2,3,4")
(())
> (CSV:parse-string "1,,2,,,6,,,,")
(("1"))
(λx. x x) (λx. x x)
rickyboy
 
Posts: 493
Joined: Fri Apr 08, 2005 7:13 pm
Location: Front Royal, Virginia

Re: Parsing through a CSV file

Postby rickyboy » Fri May 03, 2013 3:10 pm

cormullion is right ... and fast on the trigger!

Re: Parsing through a CSV file

Postby cormullion » Fri May 03, 2013 3:21 pm

You were busy doing some proper testing... :)

Re: Parsing through a CSV file

Postby rickyboy » Fri May 03, 2013 6:39 pm

Empty field bug fix

The bug demo/fix needs a test csv file.

Code: Select all
$ cat > test-for-empty-fields.csv
1,2,3,four,"five",6,,,,
,2,,4,5,,,9,10
,,3,4,5,,,,,
,,,4,,,,9,

Fire up newLISP and recreate the bug with the test file:

Code: Select all
$ newlisp csv.lsp
newLISP v.10.4.5 on OSX IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.

> (CSV:parse-file "test-for-empty-fields.csv")
(("1" "2" "3" "four" "five" "6" "" "" "")
 ()
 ("")
 ("" ""))

Fix it. The redefinition below is a hot fix: it only patches the running session. For a permanent fix, you will need to replace the definition of regex-token-empty in csv.lsp with the following definition:

Code: Select all
> (context 'CSV)
CSV> (define (regex-token-empty delimiter) (format "^%s" delimiter))
CSV> (context 'MAIN)
> ;; Now let's try the same test.
> (CSV:parse-file "test-for-empty-fields.csv")
(("1" "2" "3" "four" "five" "6" "" "" "")
 ("" "2" "" "4" "5" "" "" "9" "10")
 ("" "" "3" "4" "5" "" "" "" "")
 ("" "" "" "4" "" "" "" "9"))
> ;; Looks much better. :)
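
To see what the redefined function produces, the pattern can be tried on its own. This sketch assumes a comma delimiter; how csv.lsp uses the pattern internally is not shown in this thread, so it only demonstrates what the pattern itself matches.

```newlisp
;; What the redefined function builds, assuming a comma delimiter.
(set 'pattern (format "^%s" ","))
(println pattern)

;; The pattern matches only when the text starts with the delimiter,
;; which is what a line looks like when its next field is empty.
(println (regex pattern ",2,3,4"))   ; a match at offset 0
(println (regex pattern "2,3,4"))    ; nil, the text starts with a value
```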

Also, my first test input file file.csv works the same as it did before the fix. Yes! (Whew! :)

Code: Select all
> ;; Regression test:
> (CSV:parse-file "file.csv")
(("1" "2" "3" "4")
 ("Bob" "Accting Dept" "20")
 ("Sue" "Personnel" "132")
 ("Ted" "IT" " 42" "More stuff for some odd reason"))

Re: Parsing through a CSV file

Postby cormullion » Fri May 03, 2013 7:26 pm


Re: Parsing through a CSV file

Postby rickyboy » Fri May 03, 2013 7:31 pm

Git outta here! :-)
<rimshot/>

Re: Parsing through a CSV file

Postby rickyboy » Fri May 03, 2013 8:13 pm

Now, I'm truly a hipster coder. Gee thanks, cormullion. :/
https://github.com/kanendosei/artful-newlisp/pull/1

Re: Parsing through a CSV file

Postby cormullion » Sat May 04, 2013 7:53 am

Impressed! That'll wake Kanen up... :)

