
I don't think what makes PDF an 'unfortunate' format for (1) editing, (2) on-device reading, and (3) extraction of semantic information (as opposed to presentational information) is either a sin on Adobe's part or 'bloat.'

It's a page description format, not a data format, so all its decisions follow from the need to ensure that you and I can both print the same 'page' even if we use different operating systems, software, printers, exact paper dimensions, etc. I suspect the main reason it holds on so well is that so many things operate in a document paradigm, where 'document' means 'collection of sheets of paper.' Everything from the After-Visit Summary from the doctor to your car registration document already has a specific visual representation chosen so that it fits sensibly and precisely on sheets of paper.

Could HTML (say, with data URLs for its images and CSS so that it can stand on its own) or ePub be a better format in most ways? Sort of, but both are optimized for such a different goal that if you tried to evangelize that switch to everyone who makes PDFs today, you'd be met with frustration that the content looks a bit different on every device and that, depending on settings, even the page breaks fall differently.
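To make the "stand on its own" part concrete, here is a minimal sketch of a single-file HTML page whose stylesheet and image are both inlined as base64 data URLs. The helper names (`data_url`, `standalone_html`) and the sample inputs are made up for illustration, and this shows only the bundling, not any answer to the layout-consistency problem.

```python
import base64
import html


def data_url(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def standalone_html(title: str, body_html: str, css: str,
                    image: bytes, image_mime: str) -> str:
    """Build one HTML file with no external references (hypothetical helper)."""
    css_url = data_url(css.encode("utf-8"), "text/css")
    img_url = data_url(image, image_mime)
    return (
        '<!DOCTYPE html>\n'
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title>\n'
        f'<link rel="stylesheet" href="{css_url}"></head>\n'
        f'<body>{body_html}\n<img src="{img_url}" alt=""></body></html>\n'
    )


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file.
    page = standalone_html("After-Visit Summary", "<p>Hello</p>",
                           "p { margin: 1in; }", b"\x89PNG", "image/png")
    print(page)
```

The file travels as a single unit the way a PDF does, but how it is paginated and rendered is still up to whatever opens it.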

Relatedly, it's interesting to me that even Google Docs, whose documents I suspect are printed or converted to PDF far less than half the time, defaults to the "paged" mode (see Page Setup) that shows document page borders and margins, instead of the far more useful "Pageless" mode, which is more like a normal webpage that fits to the window and scrolls as one continuous surface.



