Web Crawler

Posted: Thu Apr 15, 2010 8:57 pm
by kanen
Has anyone written a web crawler in newLISP?

I have looked, but cannot find such a beast.

Any pointers would be greatly appreciated.

Re: Web Crawler

Posted: Thu Apr 15, 2010 9:32 pm
by cormullion
I'd start looking here...

Re: Web Crawler

Posted: Sat Apr 17, 2010 9:02 am
by Fritz
kanen wrote:Has anyone written a web crawler in newLISP?
Later in 2009, I wrote a kind of crawler to gather information from one big government site.

Pretty simple thing, just several hundred lines of code: "cgi.lsp" + regular expressions + a lot of cookie handling. If you are going to make a crawler without cookies, I think a simple crawler can be developed in one evening.
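A minimal sketch of that kind of one-evening crawler, assuming newLISP's built-in get-url and regex functions (the starting URL is just a placeholder, and real crawlers need cookie handling, politeness delays, and visited-set tracking on top of this):

```newlisp
;; Sketch: fetch one page and extract the href targets with a regex.
;; find-all with a pattern returns each capture ($1) for every match.
(define (extract-links page)
  (find-all {href="([^"]+)"} page $1 0))

(define (crawl url)
  (let (page (get-url url))
    ;; get-url returns the page body, or a string starting with "ERR:"
    (if (starts-with page "ERR:")
        '()
        (extract-links page))))

(println (crawl "http://example.com/"))
```

Regex-based link extraction is fragile on messy HTML, but for a single known site it is usually good enough.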

Re: Web Crawler

Posted: Sat Apr 17, 2010 2:54 pm
by kanen
I had forgotten, after using Ruby and Python (and, of course, C) for a few years, just how fetching awesome newLISP is.

I did indeed write the simple crawler in one evening and it turns out to be quite fast.
Fritz wrote:
kanen wrote:Has anyone written a web crawler in newLISP?
Later in 2009, I wrote a kind of crawler to gather information from one big government site.

Pretty simple thing, just several hundred lines of code: "cgi.lsp" + regular expressions + a lot of cookie handling. If you are going to make a crawler without cookies, I think a simple crawler can be developed in one evening.

Re: Web Crawler

Posted: Wed Apr 28, 2010 6:01 am
by hilti
Hi Kanen

I wrote some simple tools for analysing websites in newLISP and used curl for fetching URLs, because (get-url) doesn't support "https". You can simply invoke curl via the exec function.
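A small sketch of that workaround, assuming curl is installed on the system (the -s flag just silences curl's progress output; the URL is a placeholder):

```newlisp
;; get-url lacks HTTPS support, so shell out to curl instead.
;; exec returns the command's standard output as a list of lines,
;; which we join back into a single page string.
(define (fetch-https url)
  (join (exec (format "curl -s %s" url)) "\n"))

(set 'page (fetch-https "https://example.com/"))
```

Note that exec blocks until curl finishes, so fetching many pages this way is sequential unless you add your own process handling.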

Then the easiest approach would be to use SXML for parsing the returned HTML.
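A sketch of the SXML route using newLISP's built-in xml-parse, assuming the fetched page is well-formed XML/XHTML (real-world HTML is often not, and may need to be tidied first; the parse-option flags shown are one common combination for SXML-style output):

```newlisp
;; Suppress the default type tags so xml-parse emits plain SXML-style lists.
(xml-type-tags nil nil nil nil)

;; Options: 1 = drop whitespace text, 2 = drop empty attribute lists,
;; 16 = SXML-style tags. A hypothetical well-formed fragment for illustration:
(set 'sxml
  (xml-parse "<p><a href=\"x\">link</a></p>" (+ 1 2 16)))

(println sxml)
```

Once the page is an SXML tree, walking it with ref-all or match is much more robust than regexes against raw markup.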

Cheers
Hilti