Optimizing a function

For the Compleat Fan
Locked
Jeff
Posts: 604
Joined: Sat Apr 07, 2007 2:23 pm
Location: Ohio
Contact:

Optimizing a function

Post by Jeff »

I have been playing with some of the shootout functions and trying to get decent results with newlisp. Here is the fastest implementation I have so far for the sieve of eratosthenes:

Code: Select all

(define (sieve size , flags total)
  (set 'flags (array (+ size 1))
       'total 0)
  (for (i 2 (- size 1))
    (when (not (flags i))
      (for (k (+ i i) size i (> k size))
        (nth-set (flags k) 1))
      (inc 'total)))
  total)

(define (m n)
  (* (pow 2 n) 10000))

(for (i 9 7)
  (let (n (m i))
    (println (format "Primes up to %8d %8d" n (sieve n)))))
It's roughly the same as the incrementing algorithm in the nsieve programs on the shootout site. I'm able to run it in around 20 seconds on my machine (which is pretty fast). The PHP implementation runs quite a bit faster than that. That shouldn't be right... PHP isn't really optimized for this kind of work. Any ideas on how to speed this up?[/code]
Jeff
=====
Old programmers don't die. They just parse on...

Artful code

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latiitude 50N longitude 3W
Contact:

Post by cormullion »

14 seconds here...

(iMac 2GHz Core Duo)

for comparison, the Ruby version (http://shootout.alioth.debian.org/gp4/b ... =ruby&id=1) took 48 seconds.

I don't see any obvious ways to improve it...

Lutz used character vectors and cpymem for his... http://www.newlisp.org/syntax.cgi?bench ... ewlisp.txt

xytroxon
Posts: 296
Joined: Tue Nov 06, 2007 3:59 pm
Contact:

Post by xytroxon »

To really optimise code , you have to use "small iron" cpus to see the problems ;)

I tried nsieve on an old Win98 laptop (48mb ram 120 mhz) machine with PHP 5.2.2, LUA 5.1, and Python 2.5...

-------
->"C:\Program Files\php5\php.exe" test.php
Primes up to 10000 1229
Primes up to 0 0
Primes up to 0 0
------
->"C:\Lua\lua5.1.exe" test.lua
Primes up to 40000 4203
Primes up to 20000 2262
Primes up to 10000 1229
------
->C:\Python25\PYTHON.EXE test.py
Traceback (most recent call last):
File "test.py", line 18, in <module>
nsieve((1 << (int(sys.argv[1]) - k)) * 10000)
IndexError: list index out of range
------

PHP and Lua both loaded and ran in a few seconds... Until hitting memory limits and returning partial results...

Python crashes, I think for not being able to solce the problem due to lack of memory space... Required indentation makes copy and paste Python programming at best a time consuming pain and at worst, a dangerous practice!

Your newLISP code Jeff, "runs forever" at 100% CPU usage, I killed it after 10 minutes... It continually accesses the hard drive to use the hard drive virtual memory... (The functional language/VN disk thrash record on this machine is held by PLT Scheme's Dr. Scheme IDE, which takes almost 50 minutes to load and run!)

So you might be having memory "juggling" issues that don't show up on large memory machines, but requires more time than expected for your code to execute...

Swans and Lisp(s) look gracefully smooth on the surface, while paddling like heck under the surface!

-- xytroxon
"Many computers can print only capital letters, so we shall not use lowercase letters."
-- Let's Talk Lisp (c) 1976

Jeff
Posts: 604
Joined: Sat Apr 07, 2007 2:23 pm
Location: Ohio
Contact:

Post by Jeff »

Well, look at the numbers. It's making arrays of 512 * 10000 elements. Run it with smaller sizes and multiply the results. The array is used for constant access speeds.

The same with the other languages.

Xytroxon - it ran consistently at 20-21 seconds on my 2.5 ghz quad-core ppc with 4 gigs of ram.
Jeff
=====
Old programmers don't die. They just parse on...

Artful code

xytroxon
Posts: 296
Joined: Tue Nov 06, 2007 3:59 pm
Contact:

Post by xytroxon »

Jeff wrote:Well, look at the numbers. It's making arrays of 512 * 10000 elements. Run it with smaller sizes and multiply the results. The array is used for constant access speeds.

The same with the other languages.

Xytroxon - it ran consistently at 20-21 seconds on my 2.5 ghz quad-core ppc with 4 gigs of ram.
Backed it down to 1000

D:\MinGW - 18:14:04.58 Tue 09-02-2008
->"c:\program files\php5\php.exe" test.php
Primes up to 1000 168
Primes up to 0 0
Primes up to 0 0

D:\MinGW - 18:14:30.78 Tue 09-02-2008
->nl test.nl
Primes up to 512000 42445
Primes up to 256000 22525
Primes up to 128000 11987

D:\MinGW - 18:16:20.52 Tue 09-02-2008
->lua test.lua
"C:\Lua\lua5.1.exe" test.lua
Primes up to 4000 550
Primes up to 2000 303
Primes up to 1000 168
----
The spaces all look right, but Python is still MIA...
-----

While newLISP takes a few minutes to run... It looks like it is doing the calculations on this machine! Amazing!!!

Is thw PHP code the problem? Or is the Zend engine kicking in to help the speed???

This was fun... Timw for supper!!!
"Many computers can print only capital letters, so we shall not use lowercase letters."
-- Let's Talk Lisp (c) 1976

Kazimir Majorinc
Posts: 388
Joined: Thu May 08, 2008 1:24 am
Location: Croatia
Contact:

Post by Kazimir Majorinc »

Code: Select all

(setq sieve (lambda(size)
              (let ((flags (array (+ size 1)))
                    (sqrtsize (sqrt size))
                    (total 0))
                    
                   (for (i 2 sqrtsize)
                     (when (not (flags i))
                           (for (k (* i i) size i)
                             (nth-set (flags k) 0))
                           (inc 'total)))
                           
                   (for (i (+ sqrtsize 1) size)
                     (unless (flags i)
                             (inc 'total)))
                     
                   total)))
This one is ~ 1.5 times faster, but trick is mathematical, not programming.

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latiitude 50N longitude 3W
Contact:

Post by cormullion »

I can do it in 11 seconds (as opposed to 14) if I get spawn on the case.

Kasimir's function and spawn get me down to 6 seconds... :)

unixtechie
Posts: 65
Joined: Sat Sep 06, 2008 6:30 am

Post by unixtechie »

.. what is it I am doing wrong?
running your snippet on a 1.8GHz single-core Intel compaq/HP laptop with 1GB of memory takes 13 seconds under Linux

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latiitude 50N longitude 3W
Contact:

Post by cormullion »

Looks like an unsurprising time to me. My 2Ghz Core Duo runs Jeff's code in 14 seconds (presumably without engaging both CPUs). Allowing for the way I'm running scripts (not from the command line but from an editor) that seems to match yours. Jeff's using PPCs rather than Intels - are they a bit slower?

Jeff
Posts: 604
Joined: Sat Apr 07, 2007 2:23 pm
Location: Ohio
Contact:

Post by Jeff »

For math and graphics, PPCs should be faster.
Jeff
=====
Old programmers don't die. They just parse on...

Artful code

cormullion
Posts: 2038
Joined: Tue Nov 29, 2005 8:28 pm
Location: latiitude 50N longitude 3W
Contact:

Post by cormullion »

Interesting - and puzzling too, that a "2.5 ghz quad-core ppc with 4 gigs of ram" manages 20 seconds whereas a "2 ghz intel core duo with 1 gig of ram" manages 14... Still, the results are in the same general area - not 0.3 seconds, for example.

Locked