There is a potential problem with UTF-8 string values in UTF-8 enabled versions of newLISP. In those versions, the length function returns the number of bytes, not the number of UTF-8 characters, in the string. Hence, your function will not return the correct number of characters for a string that contains multi-byte characters!
In such cases utf8len must be used. Since single-byte ASCII characters (0-127) are a subset of UTF-8, length function problems may not be noticed until a user has multi-byte (non-English) UTF-8 characters to process.
A further complication is that non-UTF-8 versions of newLISP do not include the utf8len function!
A possible solution is to use this "ambidextrous" strlen function in place of length:
(define (strlen str)
  (if (= (& (sys-info 9) 128) 128)  ; bit 128 set: UTF-8 enabled newLISP
      (utf8len str)                 ; count UTF-8 characters
      (length str)))                ; otherwise count bytes
But of course it still fails to return the correct length if your users are trying to process multi-byte UTF-8 character strings on non-UTF-8 versions of newLISP, for example when dealing with UTF-8 HTML pages that include "fancy" left and right quotes in "English only" text. There, length counts bytes, so each of those quotes is counted as three characters instead of one.
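If you need a character count in that case too, one option is to count the bytes yourself: in well-formed UTF-8 every character has exactly one byte that is not a continuation byte (continuation bytes have the bit pattern 10xxxxxx). Here is a rough, untested sketch of that idea. The name utf8-chars is made up for this post and is not a newLISP built-in, and it assumes the string really is valid UTF-8 and that unpack accepts a run of "b" specifiers without spaces between them.

```newlisp
; Hypothetical helper (not part of newLISP): count UTF-8 characters by bytes.
; Assumes str holds well-formed UTF-8 text.
(define (utf8-chars str)
  (length
    (filter (fn (b) (!= (& b 0xC0) 0x80))          ; keep non-continuation bytes
            (unpack (dup "b" (length str)) str)))) ; one unsigned byte per "b"
```

For example, the string "\226\128\156Hi\226\128\157" (Hi wrapped in "fancy" quotes, written with decimal byte escapes) is 8 bytes long but only 4 characters. Unpacking every byte into a list is not fast on long strings, so this is a fallback rather than a replacement for utf8len.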
A truly "simplified" or "efficient" version of your function may not be entirely possible, depending on how robust you want your code to be - for example, module code designed to run on all versions of newLISP.
-- xytroxon
"Many computers can print only capital letters, so we shall not use lowercase letters."
-- Let's Talk Lisp (c) 1976