using regular expression causes newlisp.exe terminated

xmftlg
Posts: 6
Joined: Thu Feb 21, 2013 2:24 pm

using regular expression causes newlisp.exe terminated

Post by xmftlg »

The attached files are test.lsp, b.txt, and c.txt.

test.lsp:

Code: Select all

(set 's (read-file "c.txt"))
(println (find-all {(?s)target=_blank>(?:(?!target=_blank>).)*?在线观看_百度视频}  s ) )

(exit)

Both b.txt and c.txt contain HTML source code.

E:\newlisp>newlisp
newLISP v.10.4.7 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for options.

> (load "test.lsp")

newLISP then terminates abnormally.

After changing the first line of test.lsp to read b.txt instead:

Code: Select all

(set 's (read-file "b.txt"))

E:\newlisp>newlisp
newLISP v.10.4.7 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for options.


> (load "test.lsp")
("target=_blank>銆?em>鍟﹀暒鍟﹀痉鐜涜タ浜?/em>銆嬪姩婕紙2瀛e叏锛夐珮娓呭湪绾
胯鐪媉鐧惧害瑙嗛")

Now the correct string is returned.

Displayed in a UTF-8 environment, the string is:
("target=_blank>《<em>啦啦啦德玛西亚</em>》动漫(2季全)高清在线观看_百度视频")

Testing with v10.4.5 gives the same result.

D:\newlisp>newlisp
newLISP v.10.4.5 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.

> (load "test.lsp") ;;read c.txt

D:\newlisp>newlisp
newLISP v.10.4.5 on Win32 IPv4/6 UTF-8 libffi, execute 'newlisp -h' for more info.

> (load "test.lsp") ;;read b.txt
("target=_blank>銆?em>鍟﹀暒鍟﹀痉鐜涜タ浜?/em>銆嬪姩婕紙2瀛e叏锛夐珮娓呭湪绾
胯鐪媉鐧惧害瑙嗛")

Can anyone help?
Attachments
test.zip
(50.11 KiB) Downloaded 149 times
Last edited by xmftlg on Sat Mar 23, 2013 6:49 pm, edited 1 time in total.

xmftlg
Posts: 6
Joined: Thu Feb 21, 2013 2:24 pm

Re: using of regular expression cause newlisp.exe terminated

Post by xmftlg »

I also tried increasing the newLISP stack size, like this:

E:\newlisp>newlisp -s 100000 test.lsp

E:\newlisp>newlisp -s 1000000 test.lsp

but it seems to change nothing.

Lutz
Posts: 5289
Joined: Thu Sep 26, 2002 4:45 pm
Location: Pasadena, California

Re: using regular expression causes newlisp.exe terminated

Post by Lutz »

This is a problem in the PCRE library routines. See also here:
http://stackoverflow.com/questions/3613 ... lp-optimis

and here:
http://newlispfanclub.alh.net/forum/vie ... ash#p18722

On OSX this causes a crash, which occurs in pcre_exec(). It seems to have to do with nesting of HTML blocks.
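
One possible way to sidestep the problem, assuming the crash is triggered by the per-character lookahead construct (?:(?!target=_blank>).)*? on a long input, is to avoid the regular expression altogether: split the text at each target=_blank> marker and look for the suffix in each piece with a plain string search. The following is only a minimal, untested sketch. The function name find-links and the variables marker and suffix are made up for illustration and are not part of test.lsp.

```newlisp
;; Hypothetical workaround sketch: no regex, so pcre_exec() is never called.
(set 'marker "target=_blank>")
(set 'suffix "在线观看_百度视频")

(define (find-links str , pieces result pos)
  (set 'result '())
  ;; parse with a plain string break: every piece after the first
  ;; is the text that follows one occurrence of the marker
  (set 'pieces (rest (parse str marker)))
  (dolist (piece pieces)
    ;; plain (non-regex) search for the first suffix inside this piece
    (set 'pos (find suffix piece))
    (if pos
        ;; rebuild the match: marker + text up to and including the suffix
        ;; (find, length and slice all count bytes on strings)
        (push (append marker (slice piece 0 (+ pos (length suffix)))) result -1)))
  result)

(println (find-links (read-file "c.txt")))
(exit)
```

Because each piece lies between two markers, it cannot contain another target=_blank>, which is the condition the negative lookahead in the original pattern was enforcing one character at a time.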

xmftlg
Posts: 6
Joined: Thu Feb 21, 2013 2:24 pm

Re: using regular expression causes newlisp.exe terminated

Post by xmftlg »

Thanks Lutz.
