Neural network in newLISP

fdb
Posts: 66
Joined: Sat Nov 09, 2013 8:49 pm


Interested in playing with neural networks and don't want to use or learn python? Now you can also use your favorite programming language!

I've converted the Python program written by Michael Nielsen from his excellent on-line book at http://neuralnetworksanddeeplearning.com to newLISP. It presented some challenges, as I'm not really well versed in Python or the library used (NumPy), and I needed to brush up my knowledge of matrices as well to get it all working reasonably fast (thanks Lutz for the fast matrix functions!).

With this version you can play with stochastic gradient descent for a feedforward neural network. I've included load and convert functions for the MNIST dataset, which is a kind of standard to test learning algorithms against. Download it from http://yann.lecun.com/exdb/mnist/ and with the program below you're good to go.

Let me know if you encounter any bugs (most probably!), and Lutz, let me know if and how this can be included as a module.

Ferry

Code:

``````;; Mini-batch stochastic gradient descent learning algorithm for a
;; feedforward neural network in newLISP. Based on a Python program and book
;; by Michael Nielsen, see http://neuralnetworksanddeeplearning.com/about.html
;; and below for the MIT license.

;; Usage: (nn:sgd) with an arbitrary number and sequence of named parameters.
;; For instance (nn:sgd eta 1 network '(784 100 10) lmbda 1); other parameters will use default values, see below.

;; network         the number of neurons in the respective layers of the network. For example, '(2 3 1) describes a three-layer network with 2 neurons in the first layer, 3 in the second and 1 in the third. For the included MNIST data and load functions the first (input) layer has to contain 784 neurons and the last (output) layer 10. There can be any number of (hidden) layers and neurons in between.

;; epochs          the number of times the whole dataset is traversed
;; mini-batch-size the number of instances trained together
;; eta             the learning rate, a positive number; try 0.1 or 1 or 10 and then tweak
;; number          the number of instances to train on (number + test-data together at most 60000 for the MNIST training data)
;; test-data       the number of instances to test against, used for the accuracy calculation
;; w-init          weight initializer function, choice between default-w and large-w
;; cost            cost function, choice between quadratic and cross-entropy
;; lmbda           used for regularization to prevent overfitting, must be greater than or equal to zero
;; i-path          path to the input file; for MNIST the name of the file is train-images.idx3-ubyte
;; l-path          path to the label file; for MNIST the name of the file is train-labels.idx1-ubyte, download from http://yann.lecun.com/exdb/mnist/
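
;; Example invocations (a sketch; any subset of the parameters documented
;; above may be given, in any order):
;;   (nn:sgd)
;;   (nn:sgd eta 0.5 network '(784 30 10) lmbda 5 epochs 30)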

(context 'nn)

(define-macro (sgd)
(set
'network '(784 20 10)
'epochs 10
'mini-batch-size 10
'eta 4
'number 10000
'test-data 100
'w-init default-w
'cost cross-entropy
'lmbda 0
)
(when (args) (bind (explode (args) 2) true))
(when (< (length input) number) (load-input) (load-output))
(set 'biases (map (fn(x) (array x 1 (normal 0 1 x))) (rest network)))
(set 'weights (map w-init (rest network) (chop network)))
(dotimes (j epochs)
(setq start (time-of-day))
(setq mini-batches (explode (randomize (sequence 0 number)) mini-batch-size))
(dolist (mb mini-batches)
(update-mini-batch mb))
(if test-data
(silent (fork (println "Epoch:"(+ j 1)
" Duration:"(/ (sub (time-of-day) start) 1000)
" Accuracy:" (format "%2.2f%%" (mul 100 (evaluate number (+ number (- test-data 1))))))))
(println "Epoch: " j " complete"))))
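
;; How the named parameters reach sgd: (args) is the flat argument list,
;; (explode (args) 2) chops it into (name value) pairs, and bind assigns
;; each pair (the true flag makes bind evaluate the values), overriding
;; the defaults set above. For example:
;;   (explode '(eta 1 lmbda 5) 2) ;=> ((eta 1) (lmbda 5))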

(define (update-mini-batch mini-batch)
(set 'batchsize (length mini-batch))
(backprop mini-batch)
(set 'biases (map (fn(b nb) (mat - b (mat * nb (div eta batchsize))))
biases
(map (fn(x) (multiply x (array batchsize 1 '(1)))) nabla-b)))
(set 'weights (map (fn(w nw) (mat - (mat * w (sub 1 (mul eta (div lmbda number)))) (mat * nw (div eta batchsize))))
weights
nabla-w)))
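
;; The update above implements, for a mini-batch of size m = batchsize:
;;   w <- (1 - eta*lmbda/number)*w - (eta/m)*nabla-w   (L2 regularization)
;;   b <- b - (eta/m)*nabla-b
;; nabla-w is already summed over the batch by the matrix product in
;; backprop; nabla-b is summed here by multiplying with a column of ones.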

(define (backprop lst)
(set 'activations (list (transpose (map (fn (x) (flat (input x))) lst)))
'b-biases (map (fn(x) (multiply x (array 1 batchsize '(1)))) biases)
'b-weights weights
'nabla-b '() 'nabla-w '() 'zs '())
(dolist (bw (map list b-biases b-weights))
(push (mat + (multiply (bw 1) (activations -1)) (bw 0)) zs -1)
(push (sigmoid (zs -1)) activations -1))
(set 'delta (cost (activations -1) (transpose (map (fn(x) (flat (output x)))lst)) (zs -1)))
(push delta nabla-b)
(push (multiply delta (transpose (activations -2))) nabla-w)
(setq x -2)
(while (> x (sub (length network)))
(set 'delta (mat * (multiply (transpose (b-weights (+ x 1))) delta) (sigmoid-prime (zs x))))
(push delta nabla-b)
(push (multiply delta (transpose (activations (sub x 1)))) nabla-w)
(dec x)))

;; weight initializers
;; scale the initial weights by 1/sqrt(fan-in); y is the previous-layer
;; size (the column count of the weight matrix), as in Nielsen's book
(define (default-w x y)
(array x y (map (fn(z) (div z (pow y 0.5))) (normal 0 1 (* x y)))))

(define (large-w x y)
(array x y (normal 0 1 (* x y))))

;; cost functions

(define (quadratic a y z)
(mat * (mat - a y) (sigmoid-prime z)))

(define (cross-entropy a y z)
(mat - a y))

;; utility functions

(define (ff f)
(let (a (list (input f)))
(argmax (flat (nth '(-1 -1) (map (fn (b w) (push (sigmoid (mat + (multiply w (a -1)) b)) a -1)) biases weights))))))
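
;; ff feeds a single instance forward through the trained network and
;; returns the index (0-9) of the most activated output neuron, i.e.
;; the predicted digit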

(define (ev nr)
(if (= (ff nr) (argmax(flat(output nr))))
1 0))

(define (evaluate start end)
(let (teller 0)
(for (x start end)
(inc teller (ev x)))
(div teller (+(- end start)1))))

(define (load-mnist-data var file number block offset)
(set 'buff (read-file file)
'parse-bytes (dup "b " block)
'lst '())
(for (x offset (+ (- offset 1) (* number block)) block)
(push (explode (unpack parse-bytes (slice buff x block))) lst -1))
(set var lst) true)

;; load and normalize the MNIST image data
(define (load-input)
(load-mnist-data (sym 'input) i-path number 784 16)
(set 'input (map m-normalize (eval 'input))) true)

;; load the MNIST labels and convert them to one-hot vectors
(define (load-output)
(load-mnist-data (sym 'output) l-path number 1 8)
(set 'output (convert (eval 'output))) true)

(define (convert labels)
(map (fn (x) (setq t (dup 0 10)) (setq (t (x 0)) 1)(explode t)) labels))
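
;; convert turns each label into a 10x1 one-hot column vector, e.g.
;;   (convert '((3))) ;=> (((0) (0) (0) (1) (0) (0) (0) (0) (0) (0)))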

(define (m-map function matrix)
(map (fn(x) (map function x)) matrix))

(define (sigmoid-n z)
(div 1 (add 1 (exp (sub z)))))

(define (sigmoid-prime-n z)
(let (sig (sigmoid-n z))
(mul sig (sub 1 sig))))

(define (sigmoid m)
(m-map sigmoid-n m))

(define (sigmoid-prime m)
(m-map sigmoid-prime-n m))
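
;; Sanity checks: (sigmoid-n 0) ;=> 0.5
;; (sigmoid-prime-n 0) ;=> 0.25, the maximum of the derivative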

(define (argmax lst)
(find (apply max lst) lst))
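
;; find returns the index of the first occurrence of the maximum in lst,
;; so ties are broken in favor of the lower digit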

(define (shape lst)
(list (length lst)  (length (lst 0))))

(define (normalize n (qty 255))
(div n qty))

(define (m-normalize m )
(m-map normalize m))

(define (m-log m)
(m-map log m))

(define (m-log-1 m)
(m-map log-1 m))

(define (log-1 n)
(log (sub 1 n)))

(context 'MAIN)

; newLISP version based on a Python program written by Michael Nielsen, see below for the license of the Python program.