Page 1 of 1

Strange HikUp in 'difference

Posted: Fri Jun 01, 2007 1:51 pm
by newdep
Hi Lutz,

I have something strange in newlisp...


If I use 1 big file which contains 3 lists -> 'one 'two 'three
and I do a (load "my-big-list")

List 'one has random data and is >200.000 entries
List 'two has unique data and is >200.000 entries
List 'three has nested data, random and is >800.000 entries

then I do a (set 'one (difference two one true))
that is normal and is fast...



But! if I do

(load "my-big-list-one") ; list 'one
(load "my-big-list-two") ; list 'two
(load "my-big-list-three") ; list ' three

then I do a (set 'one (difference two one true))

Then that action always takes longer then the example above???
( 1 or 2 seconds longer )


I cant emagine that the way of loading data has effect, this is very odd...

Norman.

Posted: Fri Jun 01, 2007 2:09 pm
by newdep
I think its something else...

I run newlisp with "newlisp -m 256" this is fine..

But I also do , just befor i do 'difference, a (unqiue (flat three))

I exclude the fact that 'load has anything to do with this..

. so... below the 800.000 i had a good speed..now above the 800.000 I have a
delayed speed...

Is there a speed issue with 'flat or 'unique regarding the length of lists?
(i dont think so...but im trying to eliminate the problem..)


PS: its all done during execution of a script not from the commandline.


Norman.

Posted: Fri Jun 01, 2007 2:32 pm
by Jeff
Well, recursion is typically used in the design of functions like flat, due to the fact that there may be any number of nested levels in the list. With 800,000 entries, it would very much depend on the contents of the list for speed. If it stores string data, it will take a lot longer than symbols, for example. Also, if the list is hugely deep, that can take a long time due to recursion's overhead.

Posted: Fri Jun 01, 2007 2:49 pm
by newdep
Yes its a string list but only 1 level deep.. The odd thing is though that the
difference between 200.000 and 700.000 is nihil.

The list contains of 24.000 indicators with all 1 nested list.
The nested content vary's per list but is in total >800.000.

I think the problem started around 800.000 list entry's..

I have enought memory left , newlisp eat currently 136 MB.

..its odd that a difference of 100.000 lists does decrease 2 seconds...
(above +/- 750.000)

Norman.

Posted: Fri Jun 01, 2007 2:50 pm
by Lutz
Not sure if the 'unique' before brings any advantage. If you use 'difference' without the flag, it will give you a unique result anyway.

Internally newLISP will sort both lists involved and then do the 'unique' (for the mode without the flag) on-the-go.

It is hard to pin down these kind of differences and it may go into the opposite direction on a different OS, I have seen this in many benchmarks, which look completely different on an Intel architecture versus the PPC architecture. Both of them favouring different patterns or memory allocation/deallocation etc.
Lutz

Posted: Fri Jun 01, 2007 2:54 pm
by newdep
Yes you have a point there..

I was monitoring the memory when i started to hear the FAN blow...

Its somehow difficult to see what happens but it looks like its a kernel-memory
allocation issue indeed...

Currently the FAN is quiet again...sometimes the FAN blows for 1/2 second..

Mmmmm funny... ;-)

Posted: Fri Jun 01, 2007 6:06 pm
by newdep
yes..jeff, the 'flat is a real bugger currently.. Ill remove it..

any convertion at lists is actualy a delay...

Ill move towards pure list functions now...'ref-all will speed it up..

Posted: Fri Jun 01, 2007 6:15 pm
by newdep
My Mounth just hit the ground !!! (Doingggggg)....

Ref-all is at least 1000 times faster than 'flat or 'string on a nested list... ;-)