
They are missing the most important thing: AsmXml builds a specialized structure for a specific kind of XML. Regular parsers like Xerces-C build a generic, mutable DOM tree. That is obviously _way_ more work, with different lists for attributes and children, different node types etc.

Looking at the schemata they support, it doesn't look like they support all possible XML structures.

A better comparison might have been one of the XML binding tools vs AsmXml, that's much closer in functionality.

I'd also be careful wrt the XML-ness of this tool. Does it properly handle all Unicode obscurities? Does it handle all DTD obscurities? Is it robust against documents not matching the schema, malformed documents, or even malicious documents?




Those are exactly the questions I ask myself every time I hear about a new, super-fast XML parser. Is it a real XML parser[1] or can it only handle a "simple" subset of XML? I am surprised that more people don't seem to share the same skepticism as you and me. Probably because most people don't realize that implementing a conforming XML 1.0 parser is not a trivial task.

And you are right about the XML data binding tools. In most cases parsing XML itself is not what takes the bulk of the time. It is validation (against DTD, XML Schema, or ad-hoc), data conversion (e.g., from "123" string to 123 integer) and perhaps construction of some in-memory representation (with the memory allocations that it involves) that take the bulk of the time. There are existing tools[2] that can perform the above tasks an order of magnitude faster than a general-purpose XML parser by generating tailor-made validation and data extraction code, as well as data structures, from XML Schema.

[1] http://www.codesynthesis.com/~boris/blog/2008/05/19/real-xml...

[2] http://www.codesynthesis.com/products/xsde/
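
To make the three costs above concrete (validation, data conversion, building an in-memory representation), here is a rough hand-written sketch of what binding code does. The `<order>`/`<item>` document shape and the `Order`, `Item`, `required` and `bind_order` names are all invented for illustration; this is not the output or API of any of the tools mentioned, and a generated binding would typically work from parser events rather than from a generic tree as this sketch does.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical document shape, made up for this illustration.
DOC = '<order id="123"><item sku="A7" qty="2"/><item sku="B9" qty="5"/></order>'


@dataclass
class Item:
    sku: str
    qty: int


@dataclass
class Order:
    id: int
    items: list = field(default_factory=list)


def required(elem, name):
    """Ad-hoc validation: fail loudly if an expected attribute is absent."""
    if name not in elem.attrib:
        raise ValueError(f"<{elem.tag}> is missing attribute {name!r}")
    return elem.attrib[name]


def bind_order(text):
    """Validate the structure, convert strings to typed values, build objects."""
    root = ET.fromstring(text)
    if root.tag != "order":
        raise ValueError(f"unexpected root element <{root.tag}>")
    order = Order(id=int(required(root, "id")))  # "123" string -> 123 integer
    for child in root:
        if child.tag != "item":
            raise ValueError(f"unexpected child element <{child.tag}>")
        order.items.append(
            Item(sku=required(child, "sku"), qty=int(required(child, "qty")))
        )
    return order


order = bind_order(DOC)
print(order)
```

The result is a small, fixed set of typed objects rather than a generic, mutable tree, which is roughly the gap between a binding tool and a DOM parser discussed above.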


A number of years ago, I wrote an XML/C binding tool atop the GNOME SAX parser. It's able to handle anything in a DTD and a lot of things from XML Schema. I stopped working on it as it could do everything I had use for. http://xmel.sourceforge.net/ It does O(1) access to attributes as well, since it does binding.
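
As a rough illustration of binding on top of a SAX parser, here is a minimal sketch using Python's standard `xml.sax` module. The `<points>`/`<point>` format and the `Point`, `PointHandler` and `load_points` names are assumptions made up for this example and have nothing to do with the tool's real interface. Each attribute is converted once, as the event arrives, and stored in a named field, so reading it later is a plain field access rather than a search through an attribute list.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


class PointHandler(xml.sax.ContentHandler):
    """Turns SAX start-element events into Point records; no tree is built."""

    def __init__(self):
        super().__init__()
        self.points = []

    def startElement(self, name, attrs):
        if name == "point":
            # Convert each attribute once and bind it to a named field.
            self.points.append(Point(x=float(attrs["x"]), y=float(attrs["y"])))


def load_points(data):
    """Parse a bytes document and return the bound Point records."""
    handler = PointHandler()
    xml.sax.parseString(data, handler)
    return handler.points


points = load_points(
    b'<points><point x="1.5" y="2"/><point x="-3" y="0.25"/></points>'
)
print(points[0].x, points[1].y)
```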


> It's able to handle anything in a DTD and a lot of things from XML Schema.

Which would be great if those two weren't basically the worst schema languages available for XML.



