5 Cent tip for today [ Decode me ]
5 Cent tip for today [ Decode me ]
Just playing with some form of decoding the example below could
become pretty complex if you would enhance it more...
If you like working with lists and newlisp then here, i guess, a nice riddle to break just
you mind over it for a couple of minutes...
you have two lists available :-) which is enough to decode...
http://www.nodep.nl/downloads/newlisp/decodeme.lsp
Decoder function solutions can be posted here...(keep it as small as posisble ;-)
enjoy...Norman.
become pretty complex if you would enhance it more...
If you like working with lists and newlisp then here, i guess, a nice riddle to break just
you mind over it for a couple of minutes...
you have two lists available :-) which is enough to decode...
http://www.nodep.nl/downloads/newlisp/decodeme.lsp
Decoder function solutions can be posted here...(keep it as small as posisble ;-)
enjoy...Norman.
-- (define? (Cornflakes))
(set 'a (-1 (length (flat coordinates))))
(set 'x (map (fn (x) (first (ref x coordinates))) (sequence 0 a)))
(print (join (map (fn (y) (nth y individuals)) x)))
(set 'x (map (fn (x) (first (ref x coordinates))) (sequence 0 a)))
(print (join (map (fn (y) (nth y individuals)) x)))
Code: Select all
;;
;; And this is what you see when it worked.
;;
;; Playing with different ways of displaying code content
;; the result below is a variation using lists.
;;
;; Enjoy, Norman.
;;
(pretty-print 65535)
(set 'original '())
(set 'coordinates '())
(set 'individuals '())
; make it a character based list from original file
(set 'original (explode (set 'infile (read-file "./index.lsp") )))
; get the unique characters from the list
(set 'individuals (unique original))
; count per unique character the amount it appears in the originial file
(set 'amount (count individuals original))
; create per unique character an index and create an indexed list
(dolist (y individuals)
(define (member? x) (= y x ))
(println y)
(push (index member? original) coordinates -1))
(save "./coordinated.db" 'individuals 'coordinates)
Oke now Sammo has made his statement ;-)
Im able to continue with my question to Lutz on the above point. :)
I was seeking for a very quick function that was able to return index numbers
from a list...the only buildin function for this job seems to be 'index.
However. the 'index function has a little (in some cases) drawback and
that is that the indexed numbers it returns come from a define...
Taking an example file of i.e. 500 or 700 Kb then the function in the
source-code above (dolist......) is very very slow..takes over 5 minutes...
(No problem at all ...its a hard job...But indeed i wonder if 'map is quicker?)
Besides.. I dont like the (dolist....) function at all in my example because
it does a double check on someting i already have..and thats the individuals...
Actualy it would be nice if this could work -> (index individuals original)
Still im seeking for a very fast way to return indexed numbers from a list?
So..what is the fasted alternative? (im already reverse-enigering Sammo's example)
Regards, Norman.
Im able to continue with my question to Lutz on the above point. :)
I was seeking for a very quick function that was able to return index numbers
from a list...the only buildin function for this job seems to be 'index.
However. the 'index function has a little (in some cases) drawback and
that is that the indexed numbers it returns come from a define...
Taking an example file of i.e. 500 or 700 Kb then the function in the
source-code above (dolist......) is very very slow..takes over 5 minutes...
(No problem at all ...its a hard job...But indeed i wonder if 'map is quicker?)
Besides.. I dont like the (dolist....) function at all in my example because
it does a double check on someting i already have..and thats the individuals...
Actualy it would be nice if this could work -> (index individuals original)
Still im seeking for a very fast way to return indexed numbers from a list?
So..what is the fasted alternative? (im already reverse-enigering Sammo's example)
Regards, Norman.
-- (define? (Cornflakes))
Hi Norman,
It seems that pushing those character indices onto the end of lists gets very, very expensive when the lists get large. The following uncommented solution (I hope you can read it) was running as slowly as yours (or even more slowly) until I tried pushing onto the front of the lists -- that is, (push value list-of-stacks stack-index 0) instead of (push value list-of-stacks stack-index -1). When I made this change, I can index the newlisp_manual.html file (455KB) in about 8 seconds -- including writing the resulting 3.2MB file to the disk.Pushing the character indicies onto the front of the lists is okay and doesn't need correcting later on; the only thing that matters is that each character's index get pushed onto the correct stack.
If you really want the indicies in ascending order in each stack, the following one-line modification to the above does the trick:On a big file (e.g., newlisp_manual.html), reversing the stacks after populating them took about six percent more time on my WIN2K 1.4GHz laptop.
A couple of more trial runs suggests that this scales pretty well.
62KB file --> 0.75 seconds
445KB file --> 8.5 seconds
1.17MB file --> 20.8 seconds
It seems that pushing those character indices onto the end of lists gets very, very expensive when the lists get large. The following uncommented solution (I hope you can read it) was running as slowly as yours (or even more slowly) until I tried pushing onto the front of the lists -- that is, (push value list-of-stacks stack-index 0) instead of (push value list-of-stacks stack-index -1). When I made this change, I can index the newlisp_manual.html file (455KB) in about 8 seconds -- including writing the resulting 3.2MB file to the disk.
Code: Select all
(set 'time-begin (time-of-day))
(set 'infile "c:/newlisp/newlisp_manual.html")
(set 'outfile "c:/newlisp/test-coords.txt")
(set 'orig (explode (read-file infile)))
(set 'indiv (unique orig))
(set 'coord (dup '() (length indiv)))
(set 'ilist (map (fn (x) (find x indiv)) orig))
(set 'x -1)
(dolist (i ilist) (push (inc 'x) coord i 0))
(save outfile 'indiv 'coord)
(println (- (time-of-day) time-begin))
If you really want the indicies in ascending order in each stack, the following one-line modification to the above does the trick:
Code: Select all
(set 'infile "c:/newlisp/newlisp_manual.html")
(set 'outfile "c:/newlisp/test-coords.txt")
(set 'orig (explode (read-file infile)))
(set 'indiv (unique orig))
(set 'coord (dup '() (length indiv)))
(set 'ilist (map (fn (x) (find x indiv)) orig))
(set 'x -1)
(dolist (i ilist) (push (inc 'x) coord i 0))
(set 'coord (map reverse coord))
(save outfile 'indiv 'coord)
A couple of more trial runs suggests that this scales pretty well.
62KB file --> 0.75 seconds
445KB file --> 8.5 seconds
1.17MB file --> 20.8 seconds
Sam's solution seems to be the best, but there is one other potential bottle neck in:
(set 'ilist (map (fn (x) (find x indiv)) orig))
which is the: (find x indiv)
It doesn't matter in Norman's task because 'indiv' is such a short list. But imagine you deal with an 'indiv' with thousands of members.
This is where newLISP arrays come in handy:
This builds an array vector where each character is at the position of its value. So at "A" index 65 you have the index position of "A" in indiv.
Now the sequential (find x indiv) is replaced with (nth (char x) ivec) which does random access.
In Norman's example this saves only about 0.5% of time, but it could be worth doing it in other situations with very long ndexing vectors.
Lutz
(set 'ilist (map (fn (x) (find x indiv)) orig))
which is the: (find x indiv)
It doesn't matter in Norman's task because 'indiv' is such a short list. But imagine you deal with an 'indiv' with thousands of members.
This is where newLISP arrays come in handy:
Code: Select all
; old code
; (set 'ilist (map (fn (x) (find x indiv)) orig))
; new code
(set 'ivec (array 256))
(set 'pos -1)
(dolist (i indiv) (nth-set (char i) ivec (inc 'pos)))
(set 'ilist (map (fn (x) (nth (char x) ivec)) orig))
Now the sequential (find x indiv) is replaced with (nth (char x) ivec) which does random access.
In Norman's example this saves only about 0.5% of time, but it could be worth doing it in other situations with very long ndexing vectors.
Lutz
Hello Lutz,
Thanks for the addon...Actualy speed seems to be more important in
my tool im currently building then expected. It will for sure be eating
MB's greater then 50 MB..or more...
So that leaves me with two remaining questions...
Should I use 'index above 'find or always switch to 'array's when it comes
to performance and what will 'map do on performace ?
Regards, Norman.
Thanks for the addon...Actualy speed seems to be more important in
my tool im currently building then expected. It will for sure be eating
MB's greater then 50 MB..or more...
So that leaves me with two remaining questions...
Should I use 'index above 'find or always switch to 'array's when it comes
to performance and what will 'map do on performace ?
Regards, Norman.
-- (define? (Cornflakes))
'map' is pretty good performance wise, becusae it goes sequentially thru the list one by one, so 'map' will always scale well. The problem with 'find' is that it searches sequentially, so if the list is very long it becomes an issue.
I would use 'index' only if you are really interested to get various positions back in a list. But that was not the case in your riddle, because every letter occured only once in the individual vector and 'find' returns as soon as it found one. 'index' keeps on searching hoping to find more until the end of the list.
The array versus list issue is difficult to decide. In your riddle it did not make a real difference and Sam's pure list code was much shorter. I always start coding as a list and only goto arrays when time seems to get a problem. As a thumb rule I would say when the list doesn't exceed a few hundred elements its not worth thinking about arrays. But then again in an individual case things may be different.
Lists are very good when going thru it sequentially (like map), but as soon as you have to jump around in it randomly it can get a problem.
When you are using 'nth-set' or 'set-nth' be also aware of the difference between the two. 'nth-set' is much faster becuase it doesn't need to return the whole list or array.
Lutz
I would use 'index' only if you are really interested to get various positions back in a list. But that was not the case in your riddle, because every letter occured only once in the individual vector and 'find' returns as soon as it found one. 'index' keeps on searching hoping to find more until the end of the list.
The array versus list issue is difficult to decide. In your riddle it did not make a real difference and Sam's pure list code was much shorter. I always start coding as a list and only goto arrays when time seems to get a problem. As a thumb rule I would say when the list doesn't exceed a few hundred elements its not worth thinking about arrays. But then again in an individual case things may be different.
Lists are very good when going thru it sequentially (like map), but as soon as you have to jump around in it randomly it can get a problem.
When you are using 'nth-set' or 'set-nth' be also aware of the difference between the two. 'nth-set' is much faster becuase it doesn't need to return the whole list or array.
Lutz
Sammo how long have you been programming lisp?
Its so dammm simple and soo far away sometimes...
This is the first time i see someone using "-1" in an inc..!!!
Clever ?! well yes ! i have never ever used it and its sooo damm simple...
after 15 years of programming i just realised that I was always using
the wrong tools! Pffffffff... I need a coffie.. :-)
Its so dammm simple and soo far away sometimes...
This is the first time i see someone using "-1" in an inc..!!!
Clever ?! well yes ! i have never ever used it and its sooo damm simple...
after 15 years of programming i just realised that I was always using
the wrong tools! Pffffffff... I need a coffie.. :-)
-- (define? (Cornflakes))