Stemming in newLISP



Post by methodic »

Coded this real quick for a project I've been working on.


;; stemmer for newLISP, relies on the following code:
;; http://tartarus.org/~martin/PorterStemmer/c_thread_safe.txt
;;
;; download, rename to stemmer.c and compile with:
;; gcc -fPIC -c stemmer.c
;; gcc -shared -o libstemmer.so stemmer.o
;;
;; this way was faster than porting the cLISP one :)

(constant 'STEMLIB "/home/tony/wiki/libstemmer.so")

(import STEMLIB "create_stemmer")
(import STEMLIB "stem")
(import STEMLIB "free_stemmer")

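(define (stemmer words)
  ;; locals instead of globals; one stemmer instance serves the whole list
  (let ((s (create_stemmer)) (new_words '()))
    (dolist (w words)
      ;; stem takes the index of the last character and returns
      ;; the index of the last character of the stem
      (let ((k (stem s w (- (length w) 1))))
        (push (slice w 0 (+ k 1)) new_words -1)
      )
    )
    (free_stemmer s)
    new_words
  )
)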

(set 'sentence "Martin Scorsese directed the film Taxi Driver")
(set 'new_sentence (join (stemmer (parse sentence " ")) " "))

(println new_sentence)
(exit)
[tony@lcars ~/wiki]$ ./stemmer.lsp
Martin Scorses direct the film Taxi Driver
Of course you can change this to be its own context and such; I just wanted to show a quick example.
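
For anyone who wants the context version, here is a rough sketch of how it might look. The context name Stemmer, the constant LIB, and the local names are made up for illustration and are not from the code above. It assumes libstemmer.so was built as described in the comments and sits at the same path. I have not run it against every newLISP version, so treat it as a starting point.

```newlisp
;; Sketch only: wraps the stemmer in its own context.
;; Stemmer, LIB, result and k are hypothetical names.
(context 'Stemmer)

;; path taken from the original post; adjust to wherever the library lives
(constant 'LIB "/home/tony/wiki/libstemmer.so")

;; the imported symbols land in the Stemmer context, not in MAIN
(import LIB "create_stemmer")
(import LIB "stem")
(import LIB "free_stemmer")

;; default functor, so callers can write (Stemmer list-of-words)
(define (Stemmer:Stemmer words)
  (let ((s (create_stemmer)) (result '()))
    (dolist (w words)
      ;; same logic as the original: keep w up to the index stem returns
      (let ((k (stem s w (- (length w) 1))))
        (push (slice w 0 (+ k 1)) result -1)))
    (free_stemmer s)
    result))

(context MAIN)

;; usage, with the sentence from the post
(set 'sentence "Martin Scorsese directed the film Taxi Driver")
(println (join (Stemmer (parse sentence " ")) " "))
```

The point of the context is that create_stemmer, stem and free_stemmer no longer sit in MAIN, so they cannot clash with other symbols in a bigger project.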
