println decimal string

Thorstein
Posts: 7
Joined: Tue May 26, 2015 11:30 pm

println decimal string

Post by Thorstein »

Running on Windows, (exec)'ing Google Translate returns the following string inside a JSON container:

(a) "Nous allons habiller pour la randonnée, selon la météo."

However, somewhere in the process of a (string str "") or a (replace x str y), the string begins to (println) like this:

(b) "Nous allons habiller pour la randonn\195\169e, selon la m\195\169t\195\169o. \n\n</td></tr>"

How can I convert (b) to (a) so I can (println) (a) to a static HTML file?

Do I have to make a unicode build?

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Re: println decimal string

Post by Lutz »

The "\195\169" portion of the string is just the way newLISP encodes a string when output is directed to a device or terminal that cannot display UTF8 characters. The byte sequence 195 169 is the UTF8 encoding of the character é (Unicode code point 233).

If you ran (println "\195\169") in a UTF8-capable terminal, e.g. on OSX, you would see é.
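
To make the relation between the bytes 195 169 and the code point 233 concrete, here is a minimal sketch. It uses only byte and integer operations, so it should behave the same in UTF8 and non-UTF8 builds. The symbol name e-acute is an illustrative choice and does not come from the posts above.

```newlisp
; "\195\169" uses three-digit decimal escapes: two bytes, 195 and 169.
(set 'e-acute "\195\169")

; length counts bytes, not displayed characters.
(println (length e-acute))

; unpack with two "b" (unsigned byte) fields lists the raw byte values.
(println (unpack "bb" e-acute))

; Decode the two-byte UTF8 pattern 110xxxxx 10xxxxxx by hand:
; keep the low 5 bits of the first byte and the low 6 bits of the second.
(println (+ (<< (& 195 0x1F) 6) (& 169 0x3F)))
```

The last expression works out to 3 * 64 + 41 = 233, the code point of é.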

The following code would create a usable page.html that correctly shows the accented é in a web browser:

(write-file "page.html" (string
   {<htmL><head> <META http-equiv="Content-Type" content="text/html; charset=utf-8" /> </head><body>}
   "\nNous allons habiller pour la randonn\195\169e, selon la m\195\169t\195\169o.\n"
   "</body></html>"
))

This is the page generated by the above program. The two \n characters in the string come out as line feeds, which makes the HTML source easier to read.

<htmL><head> <META http-equiv="Content-Type" content="text/html; charset=utf-8" /> </head><body>
Nous allons habiller pour la randonnée, selon la météo.
</body></html>

Note that the first string argument is delimited with curly braces {...}. Doing it this way lets me include unescaped quotes " in the string. Normally, when using "..." to delimit a string in newLISP, you have to escape special characters with a backslash, like this: \". For strings longer than 2048 characters, use [text]...[/text] tags as delimiters instead. All this is explained in the manual.
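
The three delimiter styles described above can be compared side by side. This is only a small sketch; the symbol names quoted, braced and tagged are made up for illustration.

```newlisp
; Three ways to write the same text: say "hi"
(set 'quoted "say \"hi\"")            ; quotes inside "..." need a backslash
(set 'braced {say "hi"})              ; braces: no escaping needed
(set 'tagged [text]say "hi"[/text])   ; [text] tags: also suited to very long strings

; All three literals should hold the same string.
(println (and (= quoted braced) (= braced tagged)))
```

Keep in mind that escapes such as \n or \195 are only processed inside "...". Between braces or [text] tags they stay as literal backslash sequences.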

So the sequence \195\169 has nothing to do with println or string. It is just the special way to encode UTF8 characters in newLISP.

The above code to write page.html also works with non-UTF8 versions of newLISP, but if you do a lot of web work in non-English languages, I recommend using the UTF8 version of newLISP. That way, many of the string manipulation functions are UTF8-aware.
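
As a rough illustration of what UTF8-aware means here, the sketch below compares the byte count with the character count. It assumes a UTF8-enabled build, since utf8len is not available in non-UTF8 builds. The symbol name word is hypothetical.

```newlisp
; Assumes a UTF8-enabled newLISP build; utf8len is not available otherwise.
(set 'word "m\195\169t\195\169o")   ; "météo" written with decimal byte escapes

(println (length word))    ; number of bytes
(println (utf8len word))   ; number of UTF8 characters
```

Each é takes two bytes, so the string has 7 bytes but only 5 characters.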

Thorstein
Posts: 7
Joined: Tue May 26, 2015 11:30 pm

Re: println decimal string

Post by Thorstein »

Thanks, Lutz! I found the latest UTF8 build. That is doing the trick. (That and RTFM! :-/ ).

And many thanks for this great Lisp! (And for the great documentation.)
