Encoding a URL with Chinese characters raises an error

xmftlg
Posts: 6
Joined: Thu Feb 21, 2013 2:24 pm

Post by xmftlg »

newLISP v.10.4.5 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.

(define (url-encode str)
    (replace {([^a-zA-Z0-9])} str (format "%%%2X" (char $1)) 0))

(url-encode "倒")

ERR: invalid UTF8 string in function char

I need help.

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Post by Lutz »

You have to set the UTF-8 option (2048) for regular expressions:

(define (url-encode str) 
    (replace {([^a-zA-Z0-9])} str (format "%%%2X" (char $1)) 2048))

> (url-encode "爱")
"%7231"
> $1
"爱"
> (char 0x7231)
"爱"

See here for all options: http://www.newlisp.org/downloads/newlis ... html#regex
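
With option 2048 the pattern matches one whole character, and `char` returns that character's Unicode code point rather than its UTF-8 bytes. A small sketch of the difference, assuming a UTF-8 enabled newLISP like the one in the first post:

```newlisp
; Unicode code point of the character
(char "爱")            ; 29233, which is 0x7231

; length counts bytes, not characters
(length "爱")          ; 3

; the three UTF-8 bytes as unsigned integers
(unpack "bbb" "爱")    ; (231 136 177), which is e7 88 b1 in hex
```

So "%7231" is the code point written in hex, while the bytes themselves are e7, 88 and b1.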

xmftlg

Post by xmftlg »

Many thanks, Lutz.

But in a URL,

(url-encode "爱")

should give %e7%88%b1, the percent-encoded UTF-8 bytes of the character.

How can I do that?

Lutz

Post by Lutz »

(define (url-encode str) 
    (join (map (fn (chr) (format "%%%02x" chr)) (unpack (dup "b" (length str)) str))))

(url-encode "所有的愛是公平的") 
;=> "%e6%89%80%e6%9c%89%e7%9a%84%e6%84%9b%e6%98%af%e5%85%ac%e5%b9%b3%e7%9a%84"

P.S. This function and a url-decode for UTF-8 are now also available here:
http://www.newlisp.org/index.cgi?page=Code_Snippets
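
For the reverse direction, a decoder might look roughly like the sketch below. The name `url-decode-bytes` is made up for illustration and is not necessarily what the Code Snippets page uses. It assumes the input only contains `%hh` escapes like the ones the byte-wise encoder produces.

```newlisp
; byte-wise encoder from the post above
(define (url-encode str)
    (join (map (fn (chr) (format "%%%02x" chr)) (unpack (dup "b" (length str)) str))))

; hypothetical decoder: turn every %hh back into the single byte it stands for
; regex option 1 makes the match case-insensitive, so %E7 and %e7 both work
(define (url-decode-bytes str)
    (replace "%([0-9a-f][0-9a-f])" str (pack "b" (int $1 0 16)) 1))

; round trip: encode, then decode back to the original string
(url-decode-bytes (url-encode "爱"))
```

Note that the encoder escapes every byte, including plain ASCII letters, so "abc" becomes "%61%62%63". That is still valid percent-encoding, only longer than necessary.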

xmftlg

Post by xmftlg »

Thank you, Lutz.
It works.

I should keep on learning newLISP.

P.S. I found this via Google:

https://github.com/kosh04/newlisp.snipp ... er/net.lsp

;; URL translation of hex codes with dynamic replacement
(define (url-encode url (literal ""))
  (join (map (lambda (c)
               (if (or (regex "[-A-Za-z0-9$_.+!*'(|),]" (char c))
                       (member (char c) literal))
                   (char c)
                   (format "%%%02X" c)))
             ;; 8-bit clean
             (unpack (dup "b" (length url)) url))))

I haven't tested it.
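
A quick way to try it out could be something like the following sketch. It repeats the definition so it can be pasted into a fresh session, and it assumes a UTF-8 enabled newLISP. The sample inputs are arbitrary.

```newlisp
;; definition as found in the linked snippet
(define (url-encode url (literal ""))
  (join (map (lambda (c)
               (if (or (regex "[-A-Za-z0-9$_.+!*'(|),]" (char c))
                       (member (char c) literal))
                   (char c)
                   (format "%%%02X" c)))
             (unpack (dup "b" (length url)) url))))

(url-encode "abc")       ; plain ASCII letters stay as they are
(url-encode "a b")       ; the space becomes %20
(url-encode "爱")        ; each UTF-8 byte becomes %E7 %88 %B1
(url-encode "a/b" "/")   ; characters listed in literal are kept
```

Unlike the byte-wise version earlier in the thread, this one leaves letters, digits and a few punctuation characters unescaped and prints the hex digits in upper case.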
