Reverse engineering yet another eBook format (mijailovic.net)
389 points by Metalnem on Dec 25, 2022 | 86 comments



This is really an issue that should not exist. I have also bought textbooks only to be immediately disappointed by the fact that you have to use a specific app (like the Adobe DRM thing... or just try to get your money back). Then you have to start figuring out how to remove the DRM and lose time with that. Sometimes the quality of the book is also pretty bad, e.g. they don't even include a digital table of contents, which should be essential in a digital copy of any book (this has happened to me with textbooks from Springer). In my case, this has led to actually buying fewer textbooks and just pirating the books instead. I don't have the time to deal with all this customer-unfriendly behavior.


One of my textbooks came with a digital copy. I would sometimes just use that on my iPad instead of lugging in the physical book to campus. I remember racking my brain all day on a problem that asked you to prove some result, wondering how the hell that was true only to get home and realize the digital version flipped the inequality...


As a maths teacher who has also written a textbook, I have seen a drastic increase in errors in books compared to 5-10 years ago. I don't know why; maybe the schedule is tighter or proofreading is now too expensive.


I've authored book chapters with Springer. I've used their LaTeX class and produced some nice looking chapters (imho).

They get "typeset" by SPI in Chennai by a non maths speaker, non-native English speaker and come back with errors in both the equations and words and author queries like "equation missing but we follow tex".

One professional mathematician I know in a suitably obscure area of number theory has started her own journal to get around it all. I'm in both the physics and medicine camps and have to put up with this bullshit. It's awful.


The last Springer book I've had is so poorly written that there is absolutely no way it was proofread. It is not just one mistake here or there. It looks like a quick draft with incomplete sentences and repetitions all over. Looks like those "cheap" Packt books.


Looks like someone sent "fundamentals_of_math-e4-v2-final-finalv3-final.docx" to the printer instead of "fundamentals_of_math-e4-v2-final-finalv3-final-final.docx"


I published a book with De Gruyter and there is no proofreading at all. No support to get e.g. the colours right for printing. They only give you some example books, a rough LaTeX template and minor advice on content and audience. Once it is published they will ask you if you want to prepare a revised version. It was OK because e.g. my students get a free ebook version anyway via the university subscription, and I guess it is OK for the rest to pay for the overpriced but well-printed book.


If your students get it for free why not make it open access on wikibooks or something like that? The only reason people in academia were publishing the books as books was for resume stuffing but that was a different domain.


Actually resume stuffing was the major reason (not to get a job but rather for positioning oneself). I wrote it in my free time actually (my income on the book is about a 10th of the time put into it). That it is free to the students is just a nice side effect. The target audience is decision makers in smaller companies. You do not reach them via wikibooks. I guess I would have never written it as open access, because in this case I would have left it to others.

I would like to see a peer review/editing community for wikibooks btw, that actually would lead to better quality of books. distill.pub eg IMHO is great.


From what I heard, it’s because Springer et al. hired editors in India to cut costs…



I have a $70 ebook (Amazon Kindle) of an academic work (not a textbook) and it is practically unusable. So many problems compared to the physical book. Never again.


I've found that a ton of older books published as ebooks were clearly just OCR'd and then not thoroughly edited. Definitely frustrating.


Handling complex material has never been a priority for the Kindle. To be fair, more recent iterations have improved support for things like tables and MathML, but it’s still a work in progress.


With my ADHD I learn programming drastically better from physical books. Stupid I know, but it helps drastically to see it on paper.

Sadly it’s either crazy expensive, or not even an option to get paper for newer frameworks


> Stupid I know, but it helps drastically to see it on paper.

Not stupid at all. I, and every single colleague of mine that I know (pure mathematicians), have to print every non-trivial document that we really want to study deeply.


I'm reminded of 'edw519, and his habit of printing out source code listings to work on without losing focus. See e.g. http://static.v25media.com/edw519_mod.html#chapter_49, and then CTRL+F for "print" from there.


I used to do this all the time in my early career. Stopped at some point.

Probably when I stopped working on really hard things and started working on complicated things that touch tons of files.

Going to try this again. It was really helpful


The biggest problem I have with it for textbooks is just how slow the fucking thing is to scroll between pages. Which is a key operation for a reference material.

I spent $100 on a textbook on kindle, then just pirated the PDF for this reason.


These are pretty poor excuses to cope with piracy


I think that the poor excuses are on the other side. Adding DRM to files to actively prevent the user from choosing on which devices he/she wants to read the file, and for how long the content will be available, goes against the whole concept of "buying" a book. When I buy a book I expect to be able to open it in ten years on whatever device I'm using, running any OS that I want. That's why we have standard formats. And I also expect a book for which I paid 50-90€ (because that is normally the price range) to be of good quality and to have at least a table of contents. On a few occasions I had to manually add a table of contents to books that otherwise would have been unusable to me.


"buying a book". You're not, though. Nor a film/movie. It is, at least in the US, treated as something along the lines of a temporary contract to view. And so... DRM is used as a tool of enforcement of the contract term's expiration.


Hence piracy is the better option.


Piracy is the moral option.


Yeah, that's the whole problem.


If legal options aren’t usable, the black market will fill the void. Once this is fixed, the dynamic swings back (like how torrenting became less popular once streaming services arrived). You might find it immoral, but 99% of the world disagrees with you (e.g. the ubiquity of illegal drug use worldwide).


Maybe my perception is off, but it seemed to me that torrenting really dropped off a cliff when Netflix was dominant, and had basically everything available on it.

Then, when the streaming service market blew up because the content owners wanted to make more money by only offering their content on their own service (Disney+, CBS all access, etc.), torrenting went back up because it was no longer inexpensive for a streaming subscription, since you needed a dozen of them to cover everything you wanted to see (with usually only one show on one service that you care about).


Yeah definitely, the pendulum swings based on consumer satisfaction. Hopefully streaming can be federated in some way and we can get past this phase.


I kinda doubt it: doing so would decrease profits, because you would, for instance, be able to pay a tiny amount to just watch one show on Disney+ instead of having to buy a full subscription.

Instead, it's going to be like cable TV and landline telephone service: something new will come along to attract people's attention, and they'll just keep pissing customers off, and as more and more leave, the service will get worse and worse, and the prices higher and higher, to take advantage of the suckers who refuse to leave, until the whole thing finally implodes.


Those web viewers for e-Books make PDF look great. There is nothing I dread more at the library for my uni than seeing a book is "in the collection" but is only visible through a web viewer. It's like trying to read the book through a shaving mirror.

(1) They try to slice the book into the thinnest possible salami so you are always click click clicking to navigate

(2) You can't download the book to read on the bus

(3) There is a general "evil" trend in the web to hide the content as much as possible. Cookie popups and "subscribe to our email newsletter" popups count, but hard-to-use navigation, <iframe> and friends all make their contribution. You'd think the people who make this stuff are paid more the less content is visible on the page.

(4) and of course it seems like the people are paid more the more you have to struggle to log in.


Yes. I find it insane that it is faster to look in the paper book's register and find the correct page than to search their crappy ebooks. I have timed it. The ebook experience is abysmal.


> More importantly, allow me to retire from having to write the tools that bypass your restrictions.

FFS PLEASE! I waste so much of my life fighting DRM and anti-RE measures. It's so infinitely frustrating to fight against stuff that shouldn't exist in the first place. Someone wasted their time to waste my time and in the end we end up exactly in the same spot.


No, you don't wind up in the same spot. Most customers aren't like you, and don't waste their time fighting DRM and anti-RE measures: they just put up with the BS and pay through the nose for it. So the publishers profit handsomely, and the people they employ to write all that BS get paid too. The customers of all the ebooks get shafted, but they don't care about that, and they certainly don't care that they made your life harder so they could make more profit.


Yep. Unfortunately, the someone in question was paid in money, rather than respect, which puts food on the table for them.


I'm flabbergasted by a digital-only edition being $82. That's a price I'd find steep even for print. I paid less than that for the print+digital combination of The Art of Computer Programming Volume 4B.

$82 and being forced into a time-limited web viewer. That's just... what's the market for that?


Textbooks are a lot more work to prepare than people think, and the market is in the thousands. Especially specialist textbooks like this require the input of many upper-middle class professionals who aren't exactly cheap - much like software.

I mean, how much time do you think it would cost to prepare a 400 page textbook on Amazon AWS, with curated examples and training exercises? How many man-hours of senior engineering talent would be required? How many copies could you realistically sell?

Some textbook authors have previously posted on HN that the money is terrible in technical books, and the biggest benefit has just been credibility that the authors can use to sell themselves as consultants.


The issue here is that publishers are raking in all the money and not even providing a good service anymore. The physical quality of my recent books, the formatting of the ebooks and the typesetting have gone downhill in the last 10 years or so. Book prices are still high, publisher profits are skyrocketing and the quality of the product goes down. It is time to set up a new way to support writers in publishing their books that doesn't involve those scammers. Not every writer can do their own typesetting and distribution.


20 years ago, most engineering textbooks were about $100 in the US. I discovered they sell them cheaper in Europe, and the cost to buy from Europe + shipping was still a lot cheaper.


Almost all the non-major online ebook viewers are like that.

Some of them would obfuscate the image resources, but that's about it.

Big players typically have something "better". Bookwalker (a Japanese ebook vendor) has the most complicated viewers in my experience, and it always converts all the text to images on the server side before serving. (And it obfuscates the images on the server side too, so you have to reverse this process yourself if you want to download them.)


Reverse-engineering Japanese ebook DRM is much more fun than English. You run into all sorts of mistaken cryptography assumptions, fun exploits, obfuscation etc. Hint for bookwalker: check out the app


> it always converts all the text to images on the server side before serving.

It is probably easy to recover the text to absolute precision with OpenAI-like models. I bet using AI could help a lot with recovering books in the future.


BookWalker generally doesn't have exclusive rights to its books, so just buy from another place instead.

On books where they do have exclusivity, though, people are doing exactly that: OCRing the book and recreating the EPUB.


People also have already cracked their Android client so you can straight up get EPUBs.

But it's not publicly available, since BW is very quick about patching their app once the crack is well known.


I am writing a book I intend to release commercially, and a good sized one at that (likely 1,500 pages+). I am absolutely stubbornly determined to release it as both print and DRM-free PDF.

It'll get pirated anyway (if anybody cares enough to do so), so why bother punishing paying customers?


> Downloading the files was super easy, barely an inconvenience

I love how the zeitgeist / popular culture allows this expression to escape the original context and become popular, but not a silly meme.


What expression? What was the original context?



I was curious if there was an older history. A Google Book search for: "Super Easy" "Barely an Inconvenience" indeed found nothing before about 2018, and then a few books use that combined phrase.

Most intriguingly, one is in Shatner's 2022 book "Boldly Go: Reflections on a Life of Awe and Wonder"

> Two of my three daughters live within a couple of miles of my house, while the other lives about fifty miles away - super easy to visit, barely an inconvenience.

Either he's really plugged into YouTubers or I can't help but think it comes from a ghostwriter.


I probably would have stopped at the pdf level because I’m lazy. Thanks for not being me, following through with your hatred, and doing a short write-up!


He was clearly not happy with owning nothing.


This reminds me of a similar experience I had, over a decade ago now, with RE'ing Flash-based eBook viewers and converting their often simply-a-wrapped-SWF formats back into something (relatively) sane like PDF. Those who have eBooks that are either a single SWF or a collection of them will know what I mean.

"does not allow for you to download your book"

Besides noticing the odd use of "your" there, if you see that sentence and think "challenge accepted", it means you still have the old-school "hacker spirit".


> Reverse engineering yet another eBook format

> Surprise, surprise, our website was using one of the most popular ebook formats!

:-/


Can anyone tell me the status of Kindle DRM? I want to buy some kindle only books but don't own a kindle. Is this even doable?


There are plugins - a search will find them - you can use with Calibre to remove DRM.

https://calibre-ebook.com/


I don’t think there is an easy way to DeDRM the latest Kindle format used by the Mac/PC and iOS readers, and there is no easy way to download all the Kindle books in your account for “transfer to device via USB” for e-ink devices all at once. I’m trying to archive my 4k or so Kindle books (I already did a similar number of Audible titles using OpenAudible) because I don’t trust Amazon to preserve access to them indefinitely.


You can, however, still download and use an older version of the Kindle program (at least on Windows), which downloads the files with the old DRM scheme.

It's one by one though.


Anecdotally, I've become aware that the Windows Kindle installer version 1.17 works fine for downloading your paid-for Kindle books and format-shifting them or making a personal back-up in a more open form. That information was current about a year or so ago; hopefully that version has not been hobbled since. Also, disable auto-updating.

I strongly suggest you only use it for your own purchases. Authors need to heat and eat too.


Kindle 1.26 works as of a week ago, if you also install the KFX Input in addition to dedrm. Best to look at the readme for such tools to get the latest info in any case.


You can find older Kindle versions for Mac as well without too much effort.


It didn't work very well on an M1 Mac with the latest OS when I tried, but I can probably run it on x86 somewhere.


I’m away from my M1 Mac at the moment so I’m not sure which older version of Kindle I ended up installing, but it worked fine to download my books, which was all I needed it to do.


I looked and I'm running 1.30.0. It wants to update every time I start it, but I just say no and it runs (slowly) but fine.


QEMU might be an answer. Not a sexy one, but DRM has never been a sexy issue.


This is indeed how I have read books, recently. I installed an old Kindle app on the PC, and configured it with my Amazon account. I download the books through the app, then start Calibre with the deDRM plugin which will find my private key directly in the Kindle directory, and convert the books to epub. I then send them to my Kobo reader via USB.


Not the answer you seek, but Kindle has apps for major platforms and a web based viewer.


The problem is that the latest versions of the Amazon reader for PC and Mac don't download the books in a format compatible with Calibre's DeDRM scripts. This is compounded by Amazon flipping off the choice to always update the software automatically (yes, it's stated as "always update automatically" if checked), so the first thing you have to do when starting up is go into settings and make sure the checkbox is cleared.


The web based reader will refuse to display some books, unfortunately.

In my case this was the sort of book that can only realistically be read on a big-screen tablet, not on a Kindle.

Well, I've shrugged and downloaded the pdf from libgen


Still useful - thanks!


This isn't a new ebook format. This is just a web viewer for epub files.
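
For anyone wondering what "just EPUB" means in practice: an EPUB file is an ordinary ZIP archive with a `mimetype` entry and a `META-INF/container.xml` that points at the package document. Here is a minimal Python sketch of that layout. It builds a toy archive in memory rather than touching any real book; the `OEBPS/content.opf` path and both function names are my own illustrative choices, and this is not the tooling from the article.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by META-INF/container.xml in EPUB archives.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical container file for the toy archive below.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_toy_epub() -> bytes:
    """Build a tiny EPUB-shaped ZIP in memory (illustrative, not a complete book)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed
        # (ZipInfo defaults to ZIP_STORED).
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip")
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        # Placeholder package document; a real one lists the book's files.
        zf.writestr("OEBPS/content.opf", "<package/>")
    return buf.getvalue()


def find_package_document(epub_bytes: bytes) -> str:
    """Return the path of the package document named in container.xml."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        root = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        return rootfile.attrib["full-path"]


if __name__ == "__main__":
    print(find_package_document(build_toy_epub()))
```

A web viewer for this format mostly has to serve those inner files one at a time, which would explain why putting them back together into a single archive is feasible.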


> My old instincts kicked in and I decided it would be more fun to hack and blog for a week than to waste any time dealing with the customer support.

Don't get me wrong, I like reverse engineering as much as the next guy. But if you really are annoyed by such DRM tactics, you _should_ deal with their customer support to communicate your frustration. Otherwise, nothing will ever change.


Since when has complaining to customer support as a retail consumer ever changed anything? Retail can cause change by using third party media channels and, by some miracle, generating enough attention that it makes business sense to respond.

Enterprise customers are a different story.


If they get enough angry expletive-laced rants about how horrible their system is, maybe they'll pass that feedback on. And you get a good catharsis out of it.


You're just wasting your breath, and the CS people don't care.


No. This stuff is intentional and the reader isn't the customer.


Extract the file and then complain to support for a refund.


> "Access Duration: 84 Months"...

> Despite all the warning signs, I went ahead and bought the ebook...

> ...allow me to retire from having to write the tools that bypass your restrictions.

You either agree to the rules, or you're breaking them. If you're breaking the rules, authors won't get compensated and we won't see other Human Kinetics books. This should be illegal.

Personally, I would love to pay hundreds for a limited access to books I would like to read. But folks aren't inspired too much about writing books today that can make them a living, since there are pirates everywhere. They want free content, because "information should be free". No, it shouldn't. Pay for it.


This is a ridiculous take.

There were no such restrictions on books after you legitimately purchased them for hundreds of years now.

If you buy the hardcopy of the very same book, no such restriction would apply.

Rip-off schemes, where publishers use DRM to cripple the reading experience by providing the content via a proprietary reader with bad UX, are one of the very reasons people resort to pirating in the first place.


What is the ridiculous take? Authors want to sell access to information they've collected, improved, filtered, and so on. They invested a tremendous amount of effort into making this available to you for a very low price (less than $1/month). The sharing is implemented in a certain way, super inconvenient, but you have been informed multiple times in advance, and accepted that.

> There were no such restrictions on books after you legitimately purchased them for hundreds of years now.

For hundreds of years there were no book pirates and no computers. Even libraries buy books so they can share them with a limited audience.


Book piracy is as old as copyright. There were presses putting out editions unauthorised by the Stationers' Company when they held a legal monopoly in England, and the US was a famous hotbed of book piracy throughout the 19th Century. Dickens, amongst others, was not happy.


> For hundreds of years there were no book pirates

What did monks do if not copy books?


I fear an old sci-fi scene I imagined many moons ago could come true one day in the distant future: our civilization is gone and the only information left is stored in what remains of the Internet, so alien explorers tap into it to gather all information they could about us, but save for a few free books, everything else is locked behind tight DRM, and they finally are forced to leave failing to preserve any knowledge about us.

I can't help but think what would have happened to our knowledge of some ancient languages had the Rosetta Stone scribes applied DRM to their texts.

https://en.wikipedia.org/wiki/Rosetta_Stone


You seem to have missed the part where he paid for the book. He did pay for it, he just wanted to be able to use it in a more convenient way.


> Personally, I would love to pay hundreds for a limited access to books I would like to read.

… Why? I'd assume that this was sarcastic, except that the sarcastic reading seems to support the post with which you seem to be arguing, and the rest of the paragraph:

> But folks aren't inspired too much about writing books today that can make them a living, since there are pirates everywhere. They want free content, because "information should be free". No, it shouldn't. Pay for it.

seems to be sincere.


> If you're breaking the rules, authors won't get compensated and we won't see other Human Kinetics books. This should be illegal.

so second-hand bookshops should be illegal too?


I'm sad this was downvoted. I think it would make a more interesting discussion for unpopular opinions to be present and engaged with rather than downvoted into invisibility.

I think you make a fair point, if you don't want to agree to the terms of a transaction, don't participate in the transaction with the intent of breaking the terms. This is the implied covenant of good faith and fair dealing in any contract.


If the seller had good faith, they wouldn't use DRM. Since they operate in bad faith, there's nothing wrong with pirating their stuff.



