[Adium-devl] O'Reilly XML blog article: Parsing XML… backwards?

Peter Hosey prh at boredzo.org
Wed Mar 14 06:48:53 UTC 2007


Found this in my referer logs:

http://www.oreillynet.com/xml/blog/2007/03/parsing_xml_backwards.html

It's an article about LMX and the various ways it's a Bad Idea. Some
of its arguments are better than others, but the article is
definitely worth a read. Also, I have a comment in there.

He makes a good suggestion:

> You can write multiple well-formed XML documents to a single file,  
> following each one by a binary trailer that gives the size of the  
> last chunk of XML. Then it is trivial for code to jump backwards  
> through the file, grabbing a little document each time and passing  
> it to a real XML parser.

This is an interesting idea. It would, essentially, be an archive of  
mini-XML-documents (which I suppose would be a bit like Colloquy's  
envelope element), which we could easily seek in reverse.
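
To make the idea concrete, here is a rough sketch in Python of how such a file could be written and then walked in reverse. The article doesn't specify a trailer layout, so the 4-byte big-endian length is purely my assumption, as are the function names (append_document, iter_documents_reversed, demo); none of this is existing Adium code.

```python
import io
import struct
import xml.etree.ElementTree as ET

# Assumed trailer layout: 4-byte big-endian length of the preceding XML chunk.
TRAILER = struct.Struct(">I")


def append_document(f, xml_bytes):
    """Write one well-formed document followed by its size trailer."""
    f.write(xml_bytes)
    f.write(TRAILER.pack(len(xml_bytes)))


def iter_documents_reversed(f):
    """Yield parsed root elements, newest first, hopping trailer to trailer."""
    f.seek(0, io.SEEK_END)
    pos = f.tell()
    while pos > 0:
        if pos < TRAILER.size:
            raise ValueError("file too short to hold a trailer")
        # Read the trailer that ends at pos to learn the chunk size.
        f.seek(pos - TRAILER.size)
        (size,) = TRAILER.unpack(f.read(TRAILER.size))
        start = pos - TRAILER.size - size
        if start < 0:
            raise ValueError("trailer points before start of file")
        # Hand just this chunk to a real XML parser.
        f.seek(start)
        yield ET.fromstring(f.read(size))
        pos = start


def demo():
    """Write three tiny documents, then read their ids back newest first."""
    buf = io.BytesIO()
    for i in range(3):
        append_document(buf, b'<message id="%d">hello--</message>' % i)
    return [el.get("id") for el in iter_documents_reversed(buf)]
```

Each chunk goes through a real parser, so odd-but-legal content such as "hello--" needs no special handling; the only backwards work is reading the fixed-size trailers.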

The downside is that it wouldn't work well with most existing XML  
tools—we couldn't simply slurp a log file and pass it to NSXMLParser,  
WebKit, or anything else, without preprocessing it to remove those  
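size markers. OTOH, it wouldn't be terribly hard to write such a
preprocessor. It would have to be a small program rather than XSLT,
though, since a file with binary trailers in it isn't well-formed XML
and an XSLT processor couldn't read it.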
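
A preprocessor along those lines might look something like this Python sketch. It assumes the same hypothetical 4-byte big-endian trailer after each chunk, assumes no chunk carries its own XML declaration or DOCTYPE, and wraps the chunks in a made-up "log" root element so the result is a single well-formed document. The name strip_trailers is mine.

```python
import struct
import xml.etree.ElementTree as ET

# Assumed trailer layout: 4-byte big-endian length of the preceding XML chunk.
TRAILER = struct.Struct(">I")


def strip_trailers(data, root_tag="log"):
    """Return one XML document containing every chunk, with trailers removed.

    The sizes sit *after* each chunk, so the chunks have to be collected
    from the end of the data and then put back into file order.
    """
    chunks = []
    pos = len(data)
    while pos > 0:
        if pos < TRAILER.size:
            raise ValueError("data too short to hold a trailer")
        (size,) = TRAILER.unpack_from(data, pos - TRAILER.size)
        start = pos - TRAILER.size - size
        if start < 0:
            raise ValueError("trailer points before start of data")
        chunks.append(data[start:start + size])
        pos = start
    chunks.reverse()
    tag = root_tag.encode("ascii")
    return b"<" + tag + b">" + b"".join(chunks) + b"</" + tag + b">"


def demo():
    """Build a two-chunk file in memory and parse the stripped result."""
    docs = [b'<m id="0"/>', b'<m id="1"/>']
    data = b"".join(d + TRAILER.pack(len(d)) for d in docs)
    return [e.get("id") for e in ET.fromstring(strip_trailers(data))]
```

The output could then be handed to NSXMLParser, WebKit, or any other ordinary XML tool, at the cost of reading the whole log once.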

The other downside is that we already have ULF and LMX; this would be  
yet another log format, whose main reason for existence would be the  
fact that LMX won't work 100% of the time with XML from the sort of  
people who name their elements “hello--”.

I'm inclined to stay with ULF, but I wanted to bounce it off you guys.
________________________________
\ Peter Hosey / prh at boredzo.org
PGP public key ID: C6550423 (since 2007-01-01)
