Bug? dots in starts/ends-with

Posted: Sat May 26, 2007 11:05 pm
by newdep
Hi Lutz,

We have a "DOT" bugging around in starts-with and ends-with.. ;-)

Something really funny is going on here...

> (starts-with "ohno!" "oh" 1)
true

> (starts-with "ohno!" "oh." 1)
true

> (starts-with "ohno!" "Yes|oh." 1)
true

> (starts-with ".ohno!" "." )
true

> (starts-with ".ohno!" "." 1)
true

> (ends-with "ohno." "." 1)
nil

> (ends-with "ohno." ".")
true



Norman.

Posted: Sat May 26, 2007 11:44 pm
by Lutz
The dot in a regular expression means 'any character'. Except for the second-to-last one, where I am not sure, they are all correct.

Lutz
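
For readers following along, here is a minimal sketch of why the starts-with calls above all return true. It calls the built-in regex function directly (passing an option number to starts-with or ends-with switches the key to a regex pattern), and the test strings are the ones from the first post. Treat the comments as an illustration, not as a statement about any particular newLISP release.

```newlisp
; regex returns (matched-text offset length) on a match, nil otherwise.
; The 0 means "no special regex options".

(regex "oh." "ohno!" 0)      ; => ("ohn" 0 3)  the "." consumed the "n"
(regex "Yes|oh." "ohno!" 0)  ; => ("ohn" 0 3)  "|" is regex alternation
(regex "\\." "ohno!" 0)      ; => nil          no literal dot in the string
(regex "\\." ".ohno!" 0)     ; => ("." 0 1)    escaped dot matches only "."
```

So "oh." is not "oh followed by a dot" but "oh followed by any one character", which is why it still matches at the start of "ohno!".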

Posted: Sat May 26, 2007 11:50 pm
by Lutz
If you are checking for a dot as the last character with a regular expression you would do:

Code:

(ends-with "ohno." "\\." 0) => true

Lutz

Posted: Sat May 26, 2007 11:52 pm
by newdep
Aha, okay... Then I think the documentation for the string part of
starts-with and ends-with should be adjusted a little..

because I was under the impression that regexes were only possible for lists..
and that the "....|....." was a hardcoded "or"..

But indeed this one bothers me, and that's the one I've been fighting all day now..

> (ends-with "ohno." "." 1)
nil


Norman.

Posted: Sat May 26, 2007 11:59 pm
by newdep
Aggg, those regexes kill me... ;-)

Double \\, man... I was hitting \. all day...


It's time for some logic inside regex ;-) (for the non-regex-manual-reading kind of programmer)
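
A short sketch of where the double backslash comes from: inside double quotes newLISP processes backslash escapes before the regex engine ever sees the pattern, so the regex `\.` has to be written as "\\.". Strings delimited by curly braces are taken literally, which lets you type the pattern the way a regex manual shows it. The sample strings below are just the ones used in this thread.

```newlisp
; Both spellings produce the same two-character pattern: backslash, dot.
(= "\\." {\.})            ; => true
(length {\.})             ; => 2

; With braces the pattern reads exactly like the regex itself.
(regex {\.$} "ohno." 0)   ; => ("." 4 1)  literal dot at the end
(regex {\.$} "ohno!" 0)   ; => nil        last character is not a dot
```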

Posted: Sun May 27, 2007 12:14 am
by Lutz
... but there is indeed a problem, which will be fixed in 9.1.7.

As a workaround, when using a regex in 'ends-with', always anchor the regular expression to the end:

Code:

(ends-with "ohno." ".$" 1) => true

Now it works correctly.

Lutz

Posted: Sun May 27, 2007 9:28 am
by newdep
Mmmm, it's not the solution.... the ".$" removes everything from my lists ;-)
I'll have to do it differently for now...

Thanks...

Norman.

Posted: Sun May 27, 2007 10:48 am
by Lutz
If you want to detect an ending dot you really should use:

Code:

(ends-with xyz "\\.$" 0)

and not

Code:

(ends-with xyz ".$" 0)

which would fire on any string in xyz, because the unescaped dot matches whatever the last character is.

The anchoring bug is fixed in 9.1.7.tgz, but I don't want to release it until the GUI stuff is done in a few days. If this is an urgent problem I can release the 9.1.7 version earlier. But including '$' at the end of the regex string really should take care of your problem.

What regex pattern are you looking for? Perhaps we can help you there?

Lutz
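
To make the difference between the two patterns concrete for list cleaning, here is a small sketch. The list `entries` and its contents are made up for illustration, and the sketch uses filter with regex rather than ends-with, so the result does not depend on the anchoring behaviour discussed in this thread.

```newlisp
; Hypothetical sample data, not taken from the original posts.
(set 'entries '("index.html" "ohno." "images/" "trailing."))

; Unescaped dot: ".$" matches the last character of every non-empty
; string, so nothing is filtered out.
(set 'any-last (filter (fn (s) (regex ".$" s 0)) entries))
; any-last => ("index.html" "ohno." "images/" "trailing.")

; Escaped dot: only strings that really end in "." are kept.
(set 'dot-last (filter (fn (s) (regex "\\.$" s 0)) entries))
; dot-last => ("ohno." "trailing.")
```

This would explain a ".$" pattern matching every entry in a list: each entry ends in some character, and the dot accepts all of them.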

Posted: Sun May 27, 2007 10:56 am
by newdep
You are awake early ;-)

I can live with the "." for now; I do some manual cleaning on the list
every now and then... I'm doing some web statistics ;-) Currently the
list is between 80.000 and 150.000 entries and the regex I'm using is
cleaning data.. it's okay for now.. I hope to fine-tune and release this tool
(yes, it's becoming a tool ;-) in GUI format... ;-)

Norman.