
Really late reply, so you'll probably never see this, but thanks for taking the time to explain.

I was actually wondering a) what, exactly, the ISO was specifically doing, which is kind of a stupid question :) and b) how to, uhh, "un" the "exactly", I'll word it that way.

Hiding things in invisible objects would be fairly easy to detect. I was wondering if the document might, for example, embed two spaces every X characters as an identifier, or use a seeded RNG to pick from multiple visually-identical layout methodologies, or even maybe reencode the images with a uniquely-seeded JPEG scan script, oh oh or maybe adjust individual control points and Bezier curves in the glyph tables, or...
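To make the two-spaces idea concrete, here is a minimal sketch. It is purely hypothetical and not anything ISO is known to do. It assumes plain text with exactly one space between words, and hides a 16-bit ID by doubling the space at chosen word gaps; `embed_id` and `extract_id` are made-up names for illustration.

```python
import re


def embed_id(text: str, ident: int, bits: int = 16) -> str:
    """Hide `ident` in the first `bits` word gaps: double space = 1, single = 0.

    Assumes `text` uses exactly one space between words (an assumption of
    this sketch, not a property of any real document format).
    """
    words = text.split(" ")
    if len(words) - 1 < bits:
        raise ValueError("not enough word gaps to carry the identifier")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        # Gap i carries bit i of the identifier; later gaps stay single.
        sep = "  " if i < bits and (ident >> i) & 1 else " "
        out.append(sep + word)
    return "".join(out)


def extract_id(marked: str, bits: int = 16) -> int:
    """Recover the identifier by reading the width of the first `bits` gaps."""
    gaps = re.findall(r" +", marked)
    ident = 0
    for i, gap in enumerate(gaps[:bits]):
        if len(gap) == 2:
            ident |= 1 << i
    return ident


if __name__ == "__main__":
    text = " ".join(f"w{i}" for i in range(40))
    marked = embed_id(text, 0xBEEF)
    print(hex(extract_id(marked)))
    # Collapsing runs of whitespace wipes the mark entirely.
    print(extract_id(" ".join(marked.split())))
```

The last line is the "un" part for this particular scheme: normalizing whitespace removes the mark, which is also why anything this simple would be easy to spot and strip.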

I'd probably just do an outline-to-shape or similar type of pass on it. But then I'd start wondering about the statistical probability of recovering glyph offset micro-adjustments, or hinting settings... eep.

Okay, import the whole PDF into a layout engine then re-export it. Hmm, what if... oh you know they might be reordering the paragraphs in the text... hmm, with 100 discrete text permutations, you could tell 10,000 output documents apart if all 100 permutations were left undisturbed and were recoverable. That's... quite a lot of work. They're probably not doing that.

Or are they?
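On the numbers: how many copies can be told apart depends on how the variation is counted. A rough sketch, assuming each "site" in the document is an independent spot where one of several visually equivalent variants can be chosen (the function names are hypothetical). Under that reading, 10,000 corresponds to two sites with 100 variants each, while 100 two-way choices would give far more.

```python
def distinguishable(variants_per_site: int, sites: int) -> int:
    """Distinct documents when each site independently takes one variant."""
    return variants_per_site ** sites


def sites_needed(recipients: int, variants_per_site: int) -> int:
    """Smallest number of sites that can give every recipient a unique copy."""
    count, capacity = 0, 1
    while capacity < recipients:
        capacity *= variants_per_site
        count += 1
    return count


if __name__ == "__main__":
    print(distinguishable(100, 1))   # one site, 100 variants
    print(distinguishable(100, 2))   # two sites, 100 variants each
    print(distinguishable(2, 100))   # 100 binary choices
    print(sites_needed(10_000, 2))   # binary sites needed for 10,000 copies
```

The flip side is that every site destroyed by a re-layout or re-export cuts the capacity by a factor of `variants_per_site`, so the identifier only survives if enough sites come through intact.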



I know there's a little personal identifier printed vertically on the bottom left of every page. Whether there are additional, steganographic watermarks, I don't know. It'd be interesting, but that'd require at least two people to throw enough money at ISO to get a diff between the PDF outputs.



