I wouldn't know how to write this bug...

It should serve as a testimony on how hard it is to implement Office Open XML. Not even Microsoft can get it right.



Note that nobody implements Office Open XML as in the ISO standard.

Microsoft's implementation is incompatible with what ISO specified, mainly because ISO dropped some of the byzantine stuff in OOXML during the standardization process. Microsoft never adjusted its implementation to match, so AFAIK there is no implementation of the standard in existence today.


Which is what is so frustrating about major corporations and their approach to the law. Or rather how the law is enforced.

Small deviations from speed limits are not particularly harmful, but small deviations from the standardisation process here, or breaches of insider information rules in banks, make a mockery of the system. They can pass the infractions off as unavoidable incompetence, but really the system should not accept these specious half-truth excuses in these circumstances and should come down much harder, sooner and more often.


"the law"?


Governments often require "standards" in their purchases for very good reasons, like not requiring all citizens to purchase a proprietary solution to interact with them, or getting lower prices thanks to competition. If vendors claim to be delivering standards but really aren't, then it's not much different from selling devices or services that don't meet the requirements. Obviously there is a line where incompetence becomes fraud.


It's easy to say that, but the practical realities are different.

This explains it better than what I can write:

http://www.joelonsoftware.com/items/2008/02/19.html


I like how Joel explains the necessary reasons why the file formats in question have to be so bad, while tacitly admitting that having a good reason for being bad doesn't make bad code any more useful.


The takeaway for me is that designing, developing and maintaining a full featured office suite is incredibly hard. And standardizing it is even harder. Look at all the teething problems that ODF had.

OO.org/LibreOffice doesn't even conform fully to the ODF standard. See issues like http://www.zdnetasia.com/ooxml-expert-odf-standard-is-broken...


This might be a funny quip, but I disagree. Implementing anything is hard, and weird bugs creep in. I've never seen any program without bugs. So it's pretty disingenuous to take one bug in Word and say that it shows there's something wrong with Open XML.


> Implementing anything is hard

Come on. Implementing something you invented should be easy. I would understand if the [Open|Libre]Office folks got it wrong, but Microsoft? The same company that basically discredited ISO (and badly damaged its function afterwards) in order to standardize this monstrosity? To not even bother to implement its botched bogus standard correctly is beyond insulting.

And yes, of all things wrong in MS Office Open XML, the bugs are the least important.


> Come on. Implementing something you invented should be easy.

That's absurd. So you're saying there are no Apple bugs in Quicktime or Cocoa? You're saying there are no bugs in Emacs that Stallman wrote? You're saying that there are no bugs in Mathematica written by Wolfram? You're saying that there are no bugs in Java produced by Sun? You're saying that Ken Thompson wrote no bugs in Unix? That Stroustrup wrote no bugs in C++?

I've never seen a non-trivial program, standards-based or not, that is bug free, period. Not one.

Heck, there's a 30 year old bug in binary search that largely went unnoticed -- even Donald Knuth missed the bug!
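For anyone who hasn't seen it: assuming this refers to the midpoint-overflow bug that got wide attention in 2006, the problem is that `(low + high) / 2` can overflow a fixed-width integer on very large arrays. A minimal sketch in Python, which has arbitrary-precision ints, so the 32-bit wraparound is emulated by a helper (`to_int32`, `mid_buggy` and `mid_safe` are names made up for this illustration):

```python
def to_int32(x):
    """Wrap x to a signed 32-bit value, emulating fixed-width int arithmetic."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def mid_buggy(low, high):
    # The textbook midpoint: the sum can exceed 2**31 - 1 and wrap negative.
    s = to_int32(low + high)
    # Divide truncating toward zero, as C and Java do for ints.
    return s // 2 if s >= 0 else -((-s) // 2)


def mid_safe(low, high):
    # The difference never exceeds the array length, so nothing overflows.
    return low + (high - low) // 2


low, high = 2**30, 2**30 + 2
print(mid_buggy(low, high))  # negative, so not a valid index
print(mid_safe(low, high))   # lands between low and high
```

For small indices both versions agree, which is presumably why this kind of bug stays hidden until arrays get close to 2**30 elements.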

Bugs happen in trivial programs. Any non-trivial program will have bugs.

This is completely insincere. Unless you're willing to say the same thing about ODF and virtually every other file format that exists, since I can find bugs implementing just about all of them from their core proponent.


Get used to it. Regardless of the technical merits, it's cool to hate on MS and blindly support Apple/Google on Slashdot, Reddit and even more so on HN. I've seen people quit HN in disgust because of the arguments, comments and moderation of Apple fans on here.


There's a bug somewhere in Open/LibreOffice where an old ODT file opened in a new version loses spell checker support, and no tinkering with dictionaries will fix it. It's also carried through a copy and paste.

And it happens in Word 2010 when I try to open the same ODT file.

Bugs happen.


> Bugs happen.

But one could assume they could, at least, implement correctly something they invented.


This comment really makes me wonder if you have coded on a large project with a significant group of people. Building an iPhone app is nearly impossible for a single coder to get 100% right, even though they have massive control over everything. Add the heterogeneous environment of Windows and dozens of programming groups trying to come together, and it is nonsensical to say "they could, at least, implement correctly something they invented."

Software is way harder than that.


I think Microsoft's view of software quality has contaminated the industry. If you can't build your own spec correctly, maybe that's because you got overambitious with it.

I find it ludicrous that Microsoft could write the spec, find the resources to corrupt the process at ISO, discredit a valuable institution, cripple it by inflating membership with members that don't participate on any other issues in order to promote a standard that aims to be impossible to implement by third parties and be unable to command the resources required to implement it correctly in the first place.

Didn't they have a reference implementation for the standard in the first place?!


Why hold Microsoft to standards (no pun intended) that no one else can meet? Documents produced by Office validate better against the OOXML transitional spec than documents produced by OpenOffice validate against the ODF spec.

If the ISO process had followed its normal course, the final OOXML spec would have been close to what went in, with a few fixes. Instead, IBM and a few others tried at every step to stop the process, and if they couldn't stop it, they pushed through significant changes to the spec. In effect they changed the process from standardizing (with some cleanup) an existing format into writing a new format.


So you mean Microsoft is to blame for Netscape 4.x sucking and crashing on every OS?

Office has code dating back to the 80s. For the backstory, read http://www.joelonsoftware.com/items/2008/02/19.html


Who said anything about Netscape? No. It's not OK for Netscape to ship buggy software and it's certainly not OK for a 200+ billion dollar company to do so.

What's the excuse? They couldn't hire programmers to correctly implement their own spec years after it being published?


>Who said anything about Netscape?

You did, in a way. By saying this:

>>I think Microsoft's view of software quality has contaminated the industry.

>They couldn't hire programmers to correctly implement their own spec years after it being published?

Throwing bodies at something doesn't make it right in software engineering. Haven't you heard of bugs and issues in Google's or Apple's products? After all, Apple is a bigger company now.

I was going to link to some Apple bugs but Apple Discussions was down (cue 'why can't a 200+ billion company keep their forums up? Can't they hire more people to fix it?' )

Unable to add beyond 100 pages to a document (maybe they need to hire more people to hit 200?) http://discussions.info.apple.com/thread.jspa?threadID=26879...

http://arstechnica.com/apple/news/2010/07/apple-looking-into...

http://www.youtube.com/watch?v=Pdk2cJpSXLg&feature=playe...


> in a way

You mean you interpreted it as me saying something about Netscape.

> Throwing bodies at something doesn't make it right in software engineering

Usually no, but Microsoft can throw a lot of bodies and rebuild from scratch, this time with good engineering.

> Apple is a bigger company now.

I don't think so. It's just more valuable.


Rebuilding something like Office will take a decade and will likely have no benefits at all.

Rewrites rarely make sense at all. Just see how old some of the code that's in widespread use is. Android is built on Linux, which started in 1991 (ignoring that Linux was based on the even older Minix and its Unix roots). OS X/iOS are based on BSD, Darwin and Unix, which are quite old.

See http://www.joelonsoftware.com/articles/fog0000000069.html

>I don't think so. It's just more valuable.

The point still stands.


> Rebuilding something like Office will take a decade and will likely have no benefits at all.

I think that allowing to launch better, faster, safer, leaner and stabler versions faster with less bugs and for less money, while, at the same time, uncovering and correcting bugs that have been in the codebase for decades would be a plus.

> ignoring that Linux was based on the even older Minix

That's good, because it was not. One could say it was inspired by Minix and Unix.

> OS X/iOS are based on BSD, Darwin and Unix

OSX is more closely related to NeXTStep

Your history lessons are failing you.


>I think that allowing to launch better, faster, safer, leaner and stabler versions faster with less bugs and for less money, while, at the same time, uncovering and correcting bugs that have been in the codebase for decades would be a plus.

Did you even read the link I provided?

Netscape killed itself with a rewrite, MS almost did the same with Vista before hitting reboot and starting over with old code, and now you want MS to undertake a super expensive rewrite? Things rarely work so ideally in the real world, especially for humongous feature/code bases like Office.


Yes, I did read your link. The relationship of NeXT to this discussion (about Microsoft's inability to correctly implement something they invented and that they want others to implement too - because it's a standard after all) escapes me.

Attributing Netscape's demise to a rewrite of the browser is a bit exaggerated. They were under enormous pressure from a company that had more resources to spend monthly than Netscape's entire market cap and that was giving away a browser bundled with Windows. The pressure to deliver new versions made them cut corners and allowed the browser to accumulate an enormous quantity of kludges, which culminated in the need to throw it out and restart from scratch.

If anything, they should have been rewriting from the start, never allowing the cruft to build up. It's an investment that pays back more often than not, especially if you are under pressure to deliver new features quickly.

BTW, your fixation with Netscape is interesting too. You brought it up and tried to reason I did. That's also something that escapes me.

Why would a full rewrite of Microsoft Office be so expensive? How much did Sun, Oracle and independent collaborators put in OpenOffice anyway? Microsoft has the resources for that. The reason they don't do it is because they don't need to.


That doesn't seem like a good thing to assume. Every invention has issues.


It's sad to see even otherwise knowledgeable people jump on bugs when it's Microsoft, while Apple (e.g. 3rd generation iPod Touch and iPhones getting superslow for months with the iOS 4.0 update) and Google (all the crazy unfixed bugs in SMS, like in the other article on the FP; contrast the comments there) get a free pass.

Managing the tens or hundreds of millions of lines of code dating back to the 80s is not easy.

http://www.joelonsoftware.com/items/2008/02/19.html

When it's MS, the comments and moderation are always about malice and incompetence, even on HN.

I remember someone on Reddit calling HN an Apple fanboy club. I guess they're not that far from the truth looking at the comments and moderation for the articles here.


Why are you implying that Apple and Google being fallible makes it OK to ship buggy software? It's not OK. It happens, regrettably, and has to be corrected.

> Managing the tens or hundreds of millions of lines of code dating back to the 80s is not easy.

It seems they should tackle an easier problem then. This one is, evidently, too hard for them.

> When it's MS, the comments and moderation are always about malice and incompetence

Incompetence, malice and... What would be the third explanation?

> I remember someone on Reddit calling HN an Apple fanboy club

It hasn't been one for a long time.


It's next to impossible to ship without bugs for something the scale of Office. Your comments seemed to blame Microsoft, so I was giving examples of bugs in other products that have nothing to do with MS.

>Incompetence, malice and... What would be the third explanation?

Nothing really, but my point was about all the negativity in the comments and moderation when it's MS vs. many other companies. I don't like Microsoft but I don't think they deserve such a raw deal while other companies get a free pass for very similar issues.


"Come on. Implementing something you invented should be easy."

Sounds like someone who has never implemented something more complicated than a hello world.


Sounds like you don't know me.

In the past 25+ years I got my share of bugs. But having a correct implementation of a spec is kind of the only proof you can have the spec is complete and implementable.

Not having one is just sloppy.


Of course I don't know you, but with 25 years of experience you should be in a position to recognize that a spec as complicated as this one is near impossible to implement 100% correctly. Even more, there is no way to tell if the spec is implemented correctly. Certainly not when you actually have to ship something: you can't just spend 10 years polishing the implementation and testing millions of edge cases.


> a spec as complicated as this one is near impossible to implement 100% correctly.

I gather being impossible to implement by third parties was a design requirement that was accomplished. What I find surprising is they got carried away and made a spec they couldn't implement either.

> there is no way to tell if the spec is implemented correctly

That's why a correct open reference implementation is a must. You can always say that any corner cases can be resolved according to the code.

When you make your spec excessively complex you are just asking for trouble.

This spec, BTW, exists for the sole reason as to legitimate a Microsoft format as a standard competing against ODF. It exists not to be implemented correctly, but to fragment the marketplace, preventing the standardization of something Microsoft cannot control and use as leverage.


> This spec, BTW, exists for the sole reason as to legitimate a Microsoft format as a standard competing against ODF. It exists not to be implemented correctly, but to fragment the marketplace, preventing the standardization of something Microsoft cannot control and use as leverage.

ODF was never a feasible alternative. First, Sun retained veto power via the threat of patents over ODF. Sun's patent grant for standardization was limited to a particular version of the standard and any future versions whose standardization Sun participated in. If they wanted to derail attempts to take ODF in a direction that they did not approve of, they could withdraw from the committee leaving the standard unprotected from their patents.

Second, Sun made it clear that ODF was only going to have those features necessary to support Star Office. There were attempts to make it more general, so that it could be a universal format, but Sun squashed them.


Apart from the tin foil hat, I love the doublethink in this. So basically you're saying 'screw the spec, make an implementation and make that the spec'?


Make a spec and make a reference implementation. Is there something wrong about it? What do you propose? Make a standard nobody can follow?


Sitting around writing specs while the competition runs away with the market does not really help in many situations. Guess this was even more true in the 80s when Office was originally developed.

Choose between 'sloppy and shipped' (most software) vs. 'never ships because of targeting perfection' (GNU Hurd?).


Over-engineering increases the likelihood of bugs.

Simple and elegant = less bugs.

XML is the epitome of over engineering.


Of all the objectionable things about Microsoft's Office Open XML, the (mis-)use of XML is fairly far down the list.

Trying to create a bug-compatible version of .doc is hard, as the sterling efforts of OpenOffice and others have shown. An attempt to create one in XML, by the very people who were best placed to just document the original format is way beyond good or bad engineering.


They did publish documentation for the original formats (iirc around the same time they published the new XML formats), btw.


It was a few years later on, after some wrangling with the EU about their monopoly status.


Representing documents should actually be one of the uses XML is suited for, and was the original intent. Prime example: HTML


HTML predates XML by a couple of years. HTML is loosely based on SGML.


Where in the link supplied is there any evidence that this is a bug in the file-format? Before you jump on the bandwagon of cheap Microsoft-stabs, at least do some basic critical reading of the source you are relying on.


Do you have any other explanation?


Why is it so difficult to understand the difference between executable code and a 'standard' file format?



