Character mess when reading CGI-params on UTF-8 newLISP

Kirill
Posts: 90
Joined: Wed Oct 31, 2007 1:21 pm

Post by Kirill »

I'm using cgi.lsp to read the following POST-ed parameters:

text=тест
save-note=Save

Here's what cgi.lsp gives me (web.lsp behaves the same way):

(("save-note" "Save") ("text" "\209\130\208\181\209\129\209\130"))
тест

The output above is produced by this code:

(println "Content-type: text/plain\n\n")
(println CGI:params)
(println (CGI:get "text"))
(exit)

The newLISP version is:

newLISP v.10.3.4 64-bit on Linux IPv4/6 UTF-8, execute 'newlisp -h' for more info.

Maybe someone here could give me some pointers on how to get the data as UTF-8 strings?

Regards,
Kirill

Re: Character mess when reading CGI-params on UTF-8 newLISP

Post by Kirill »

It seems that neither cgi.lsp nor web.lsp can deal with multibyte characters when processing URL-encoded data. URL decoding (and encoding) needs a small overhaul.
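
A byte-oriented decoder is one way to approach this. The sketch below is not the code from cgi.lsp, web.lsp or Dragonfly, and url-decode-bytes is a hypothetical name. It assumes the input is percent-encoded UTF-8 and turns every %XX into a single raw byte with pack, so multibyte sequences are reassembled unchanged:

```newlisp
; Hypothetical helper, not part of cgi.lsp or web.lsp.
; Assumes str holds application/x-www-form-urlencoded data in UTF-8.
(define (url-decode-bytes str)
  ; "+" stands for a space in form data (plain string replace, no regex)
  (replace "+" str " ")
  ; each %XX becomes exactly one byte; option 1 makes the match case-insensitive
  (replace "%([0-9a-f][0-9a-f])" str (pack "b" (int $1 0 16)) 1))

; "тест" as a browser would send it
(println (url-decode-bytes "%D1%82%D0%B5%D1%81%D1%82"))
```

On a UTF-8 terminal the last line should print the original Cyrillic word. pack is used instead of char because, on UTF-8 builds, char treats its argument as a Unicode code point and returns the UTF-8 encoding of that code point, which is more than one byte for values above 127.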

Re: Character mess when reading CGI-params on UTF-8 newLISP

Post by Kirill »

But Dragonfly seems to provide utf8-urlencode and utf8-urldecode. Great!

Re: Character mess when reading CGI-params on UTF-8 newLISP

Post by Kirill »

Confirming that replacing Web:url-encode with Dragonfly's utf8-urlencode solves the issue. It's just cut and paste for now, so there is no clean patch to submit.

For cgi.lsp, the solution would be similar.
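
For illustration, a byte-wise encoder could look roughly like this. It is a sketch under my own assumptions, not Dragonfly's utf8-urlencode, and url-encode-bytes is a hypothetical name. It walks the string byte by byte, so each byte of a multibyte character gets its own %XX:

```newlisp
; Hypothetical helper, not Dragonfly's utf8-urlencode.
; Keeps unreserved ASCII characters as they are and percent-encodes
; every other byte, including each byte of a multibyte character.
(define (url-encode-bytes str)
  (if (empty? str)
      ""
      (join
        (map (fn (b)
               (if (and (< b 128) (regex "^[A-Za-z0-9._~-]$" (char b)))
                   (char b)                          ; ASCII only, so char is safe here
                   (append "%" (format "%02X" b))))  ; e.g. 209 -> "%D1"
             ; length counts bytes, so this unpacks one integer per byte
             (unpack (dup "b" (length str)) str)))))

(println (url-encode-bytes "тест"))
```

Note that this version encodes a space as %20 rather than "+"; both forms are accepted by a decoder that handles "+" and %XX.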

Br,
Kirill
