Identifying strings with parse

Post by Jeff »

When using the internal parser with parse, strings in the target are not distinguished from atoms:

Code: Select all

(parse [text](println "Hello world")[/text])
...results in:

Code: Select all

'("(" "println" "Hello world" ")")
The only ways to tell them apart would be contextual analysis, which would be more difficult than is reasonable in newLISP (i.e. ML-style inference), or testing each token against the current symbol set. The latter has the disadvantage that a string equal to the name of an existing symbol will not be identified as a string. It also cannot see symbols in other contexts unless the application has been tracking context creation.

Can parse be modified to use newLISP's parsing rules but to identify strings correctly? Or perhaps identify ", {, }, [text], [/text] all as tokens?
Jeff
=====
Old programmers don't die. They just parse on...

Artful code

Post by Lutz »

You can use 'find-all' with a large regular expression, e.g. like this:

Code: Select all

(set 'newlisp {\!=|\^|\(|\)|[a-zA-Z]+|\[text\]|\[/text\]})

> (find-all newlisp "(foo [text]hello world[/text]) (!= x y)")
("(" "foo" "[text]" "hello" "world" "[/text]" ")" "(" "!=" "x" "y" ")")
> 
You could add an optional expression to preprocess each token before it goes into the return list:

Code: Select all

> (find-all newlisp "(foo [text]hello world[/text]) (!= x y)" (println $0))
(
foo
[text]
hello
world
[/text]
)
(
!=
x
y
)
("(" "foo" "[text]" "hello" "world" "[/text]" ")" "(" "!=" "x" "y" ")")
> 
Instead of (println $0) you could use any other expression that transforms $0 into something else, e.g. one that adds a type number. What goes into the list is the return value of that expression:

Code: Select all

> (define (xform) (upper-case $0))
(lambda () (upper-case $0))
> (find-all newlisp "(foo [text]hello world[/text]) (!= x y)" (xform))
("(" "FOO" "[TEXT]" "HELLO" "WORLD" "[/TEXT]" ")" "(" "!=" "X" "Y" ")")
> 
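As a rough sketch of the "add a type" idea applied to the original question: the pattern below adds an alternative that matches a whole double-quoted string as a single token, and the transforming expression tags each match. The names token-pattern and tag-token are made up for illustration, the pattern is deliberately small, and it does not handle escaped quotes, {} strings or [text] tags.

```newlisp
; Sketch only: token-pattern and tag-token are illustrative names, not part of newLISP.
; The first alternative matches a complete double-quoted string (escaped quotes are not handled).
(set 'token-pattern {"[^"]*"|\(|\)|[a-zA-Z]+})

; Tag each match as "string" or "atom" by looking at its first character.
; starts-with is called without a regex option, so $0 is left untouched.
(define (tag-token)
  (if (starts-with $0 "\"")
      (list "string" $0)
      (list "atom" $0)))

(find-all token-pattern {(println "Hello world")} (tag-token))
```

Each element of the returned list is a (type token) pair, and a quoted string keeps its surrounding quotes, so it can no longer be confused with a symbol of the same name.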

Post by Jeff »

The goal is to avoid costly regexes. I'm trying to write some pre-processing code and too much regex matching would definitely hurt load times.

Post by rickyboy »

Hey Jeff,

Just guessing here, but if the text is lisp code, eval-string might help.

Code: Select all

(define (text2sexp text-lisp-exp)
	(eval-string (append "'" text-lisp-exp)))

(text2sexp [text](println "Hello world")[/text])
   ;; => (println "Hello world")
Then you can crawl the answer and discover that println is a symbol, "Hello world" a string. Hope that helps.
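A minimal sketch of that crawl, assuming the text really is a single well-formed newLISP expression. classify-expr is a hypothetical helper name, and text2sexp is repeated so the snippet stands on its own:

```newlisp
; Same idea as above: quote the text, then let newLISP read it.
(define (text2sexp text-lisp-exp)
  (eval-string (append "'" text-lisp-exp)))

; classify-expr is a hypothetical helper: it walks the s-expression and
; labels every leaf using newLISP's own type predicates.
(define (classify-expr expr)
  (cond
    ((list? expr)   (map classify-expr expr))
    ((string? expr) (list "string" expr))
    ((symbol? expr) (list "symbol" (string expr)))
    (true           (list "other" (string expr)))))

(classify-expr (text2sexp {(println "Hello world")}))
```

Here println comes back labeled as a symbol and "Hello world" as a string. Note that reading the text this way creates any new symbols it mentions in the current context.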
(λx. x x) (λx. x x)

Post by Jeff »

No, what I'm doing is writing a run-pre-processor for true macros. See my other post about template expansion. Rather than running macros as a sort of lazily evaluating function, I am trying to use them more like CL does, that is, as a way of writing larger pieces of code more tersely.

Rather than using letex, which has no way of expanding '(+ '(1 2 3)) into '(+ 1 2 3), I'm adding [*] and [**] and trying to kludge up the same effect as a Common Lisp backtick expression. But I don't want to perform expansions inside strings, so I was hoping to be able to identify strings in the parsed token list. Apparently parse does not *quite* use newLISP's parser, because newLISP itself can obviously identify strings.
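To make the splicing problem concrete, here is a small illustration (not the [*] / [**] code itself; args is just an illustrative name): letex substitutes the whole list as one nested element, while the spliced form has to be built by hand with something like cons or append.

```newlisp
; letex replaces the symbol with the entire list, so the list stays nested.
(letex (args '(1 2 3)) '(+ args))   ; → (+ (1 2 3))

; Splicing by hand: the elements end up inline in the form.
(cons '+ '(1 2 3))                  ; → (+ 1 2 3)
(append '(+) '(1 2 3))              ; → (+ 1 2 3)

; The spliced form can then be evaluated like any other expression.
(eval (cons '+ '(1 2 3)))           ; → 6
```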