split xml stream

Posted: Fri May 11, 2007 8:00 am
by Dmi
Dealing with the Jabber protocol, I want to extract the sequence of separate XML statements from the continuous XML stream coming from the server.

A single net-receive can return a buffer that consists of one XML statement plus the start of the next statement... or it can return only part of a single XML statement...

In general, to handle this, I want to take all closed statements out of the string buffer and leave the last unclosed one (if any) in the buffer, so that further net-receive results can be appended to it.

As far as I can see, xml-parse can take a stream of statements and also survives an unclosed XML statement at the end of the stream. But how can I find the point where I must split the buffer into the "already closed statements" and "not yet closed last statement" parts?

If there is no standard way to do this, would it be possible to extend the xml-parse function so that it returns some positioning information and also a signal that the final top-level XML statement is unclosed?

Posted: Fri May 11, 2007 12:05 pm
by newdep
Tricky task... I think you will end up doing it on the fly during streaming input.
The only problem is indeed that you need to count and track the < and </ inside the stream. Not so easy, I think; it looks like it will finally become a full Jabber protocol API implementation.

<start>
<here>
<and>
<end>
<here>
</here>
</end>
</and>
</here>
</start>
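
A minimal sketch of the counting idea, written in Python rather than newLISP purely for illustration. The names `TAG` and `split_closed` are made up for this example. It assumes the stream contains no comments, CDATA sections, processing instructions, or ">" characters inside attribute values, and that every closing tag has a matching opening tag.

```python
import re

# Matches any tag. Group 1 is "/" for a closing tag such as </a>,
# group 2 is "/" for a self-closing tag such as <presence/>.
TAG = re.compile(r"<(/?)[^<>]*?(/?)>")

def split_closed(buffer):
    """Return (closed, rest): the complete top-level elements found in
    buffer, and the unclosed tail to keep for the next receive."""
    closed = []
    depth = 0
    start = 0  # index where the current top-level element begins
    for m in TAG.finditer(buffer):
        if m.group(1):          # closing tag: one level up
            depth -= 1
        elif not m.group(2):    # opening tag: one level down
            depth += 1
        # a self-closing tag leaves the depth unchanged
        if depth == 0:          # a top-level element just ended here
            closed.append(buffer[start:m.end()])
            start = m.end()
    return closed, buffer[start:]

# Simulate two receives: the unclosed tail is prepended to the next packet.
first, rest = split_closed("<a>something</a><b>")
second, rest2 = split_closed(rest + "someelse</b><c>")
```

Each closed piece could then be handed to a real XML parser, while the tail stays in the buffer until more data arrives.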

Posted: Fri May 11, 2007 2:40 pm
by Dmi
Hmmm... nice idea, newdep. I'll try it :-)

Btw, Lutz, what is the proposed way for newlisp to do this task? ;-)

Posted: Sun May 13, 2007 8:50 pm
by Lutz
Btw, Lutz, what is the proposed way for newlisp to do this task? ;-)
... similar to what Norman is proposing. Read the outer start and end tags using conventional file operations, then use 'xml-parse' for the pieces in between.

I also have a special callback function for 'xml-parse' on my list of things to do; it would work similarly to the handler function in 'net-eval' and always be called on a closing tag.
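
For comparison, this is roughly how a closing-tag callback looks in a parser that already offers one. The sketch uses Python's `xml.parsers.expat` module, not newLISP, and says nothing about how a future 'xml-parse' callback would actually be designed. The `<stream>` wrapper root and the packet contents are assumptions made for the example.

```python
import xml.parsers.expat

closed_tags = []

def on_end(name):
    # expat calls this each time it has parsed a closing tag
    closed_tags.append(name)

parser = xml.parsers.expat.ParserCreate()
parser.EndElementHandler = on_end

# expat expects a single root element, so a made-up <stream> root is
# opened first; the following packets arrive in arbitrary pieces.
for packet in [b"<stream>", b"<a>something</a><b>", b"someelse</b><c>"]:
    parser.Parse(packet, False)  # False: more data may follow
```

The handler fires as soon as a closing tag has been seen, so the caller does not have to find element boundaries itself.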

Lutz

ps: I am preparing a longer response/comments post to Norman's poll items. Look for this tomorrow

Posted: Sun May 13, 2007 9:27 pm
by Dmi
The callback is a good idea!
The feature I want (possibly in a future callback?) is a simple way to split the input stream into pieces, i.e. suppose I got two sequential packets:

<a>something</a><b>

someelse</b><c>.....etc.
and I don't want to wait until tag "b" is finished, because the sequence may be long.

So I want to know the point in the actual parsed buffer where the xml-parser has found the next closing tag.
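
To illustrate what such positioning information could look like, here is a sketch using Python's `xml.parsers.expat`, whose `CurrentByteIndex` attribute reports where in the input the current event began. This is not a newLISP feature. The class name `CloseTracker` and the `<stream>` wrapper root are assumptions made for the example.

```python
import xml.parsers.expat

ROOT = b"<stream>"  # hypothetical wrapper; expat needs one root element

class CloseTracker:
    """Records where each top-level closing tag starts in the received data."""

    def __init__(self):
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.depth = 0
        self.closings = []   # (tag name, byte offset of its closing tag)
        self.received = b""  # everything fed so far, without the wrapper
        self.parser.Parse(ROOT, False)

    def _start(self, name, attrs):
        self.depth += 1

    def _end(self, name):
        self.depth -= 1
        if self.depth == 1:  # a direct child of the wrapper has closed
            # CurrentByteIndex counts from the start of all parsed input,
            # so subtract the wrapper to index into self.received.
            offset = self.parser.CurrentByteIndex - len(ROOT)
            self.closings.append((name, offset))

    def feed(self, packet):
        self.received += packet
        self.parser.Parse(packet, False)  # False: more data may follow

tracker = CloseTracker()
tracker.feed(b"<a>something</a><b>")
tracker.feed(b"someelse</b><c>")
```

Each recorded offset points at the start of the closing tag, so the buffer can be cut just after that tag. For a self-closing element the offset points at the start of the element itself.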

Creating my own parser that does part of the work of the existing xml-parse, and then using the existing xml-parse anyway, confuses me a bit while I am trying to write the code.

Btw, the same applies to the "parse" function ;-)