Hacker News

> And I bet that MS intended such a complicated format to prevent Open Source Projects from developing parsers and MS from losing market share this way.

It wouldn’t surprise me at all if it simply was “the XML schema mostly follows how our implementation represents this kind of stuff”.

The source code of MS Word almost certainly has lots of design choices that look weird now but were made to run in constrained memory. It also has dark corners for “we released a version that did this slightly differently, so we have to keep supporting it”.





> It wouldn’t surprise me at all if it simply was “the XML schema mostly follows how our implementation represents this kind of stuff”

That’s exactly what it was. They originally had a binary representation (.doc) which was pretty much a straight-up dump of their internal data structures to disk. When they felt forced to make an “open”, “XML-based” format, they converted their binary serialization to XML without changing what it represented at all. It was basically malicious compliance.



