Character mess when reading CGI-params on UTF-8 newLISP

Posted: Wed Oct 19, 2011 10:51 pm
by Kirill
I'm using cgi.lsp to read the following POST-ed parameters:

Code:

text=тест
save-note=Save
Here's what cgi.lsp gives me (web.lsp behaves the same way):

Code:

(("save-note" "Save") ("text" "\209\130\208\181\209\129\209\130"))
тест
The output above is produced by this code:

Code:

(println "Content-type: text/plain\n\n")
(println CGI:params)
(println (CGI:get "text"))
(exit)
The newLISP version is:

Code:

newLISP v.10.3.4 64-bit on Linux IPv4/6 UTF-8, execute 'newlisp -h' for more info.
Maybe someone here could give me some pointers on how to get the data as UTF-8 strings?

Regards,
Kirill

Re: Character mess when reading CGI-params on UTF-8 newLISP

Posted: Thu Oct 20, 2011 5:36 am
by Kirill
It seems that neither cgi.lsp nor web.lsp can deal with multibyte characters when processing URL-encoded data. The URL decoding (and encoding) code needs a small overhaul.
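
For illustration, here is a minimal sketch of a byte-oriented decoder. It assumes the trouble comes from turning each %XX escape into a character rather than a raw byte (on UTF-8 builds, char treats its argument as a Unicode code point, so a value like 209 comes back as a two-byte sequence). The name url-decode-bytes is hypothetical and not part of cgi.lsp or web.lsp.

```newlisp
;; Sketch only: url-decode-bytes is a made-up name, not a cgi.lsp/web.lsp function.
;; Each %XX escape becomes exactly one raw byte, so a multibyte UTF-8
;; sequence is reassembled byte for byte as the browser sent it.
(define (url-decode-bytes str)
  ;; Form encoding uses "+" for a space; handle it before the escapes
  ;; so that an encoded plus sign (%2B) is not turned into a space.
  (replace "+" str " ")
  ;; (int $1 0 16) parses the two hex digits; (pack "b" n) emits one unsigned byte.
  (replace "%([0-9a-fA-F]{2})" str (pack "b" (int $1 0 16)) 0))

;; Usage with the urlencoded form of the "text" parameter from the first post:
(println (url-decode-bytes "%D1%82%D0%B5%D1%81%D1%82"))
```

Using pack instead of char is the whole point here: the decoder never interprets the bytes, it just passes them through, and the result is a valid UTF-8 string as long as the client sent valid UTF-8.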

Re: Character mess when reading CGI-params on UTF-8 newLISP

Posted: Thu Oct 20, 2011 5:44 am
by Kirill
But Dragonfly seems to provide utf8-urlencode and utf8-urldecode. Great!

Re: Character mess when reading CGI-params on UTF-8 newLISP

Posted: Thu Oct 20, 2011 6:20 am
by Kirill
Confirming that replacing Web:url-encode with Dragonfly's utf8-urlencode solves the issue. It's just cut and paste for now, so there is no clean patch to submit.

For cgi.lsp, the solution would be similar.

Br,
Kirill