
You're inferring the structure of the document from the printed result. If typesetting takes a set of layout directives and outputs a page, this is taking a finished page and guessing what layout directives could create it. Then you can take that inferred structure and reflow the page in a new layout.


So like OCR, but instead of recognizing characters and words, it recognizes the layout structure and turns it into content markup and layout markup?


That's a way to view it!

The reason I'm not falling back on OCR is that the general case is full of things, like math equations and inset graphics/diagrams, that can't be OCR'd. The only robust way to deal with those is to treat them as graphical atoms: "this bounding box can be moved around, but should not be split up into pieces".
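
To make the "graphical atom" idea concrete, here is a rough sketch of what reflowing could look like once the boxes have been found. This is not the actual implementation: the `Atom` class, the `reflow` function, and the `gap`/`line_gap` spacing values are all made up for illustration, and the hard part (detecting the boxes on the page in the first place) is assumed to have already happened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical unsplittable box cut from the page (a word, an equation, a figure)."""
    width: int
    height: int
    label: str = ""


def reflow(atoms, page_width, gap=4, line_gap=6):
    """Greedy line filling over atoms, in reading order.

    Each atom is placed left to right; when the next atom would run past
    page_width, a new line starts below the tallest atom of the current line.
    Atoms are only moved, never split or scaled.
    Returns a list of (atom, x, y) placements.
    """
    placements = []
    x = y = 0
    line_height = 0
    for atom in atoms:
        # Wrap only if something is already on this line and the atom will not fit.
        if x > 0 and x + atom.width > page_width:
            y += line_height + line_gap
            x = 0
            line_height = 0
        placements.append((atom, x, y))
        x += atom.width + gap
        line_height = max(line_height, atom.height)
    return placements


# Two words, a wide equation, then another word, reflowed to a narrow page.
atoms = [Atom(40, 10, "word"), Atom(40, 10, "word"),
         Atom(90, 30, "equation"), Atom(40, 10, "word")]
for atom, x, y in reflow(atoms, page_width=100):
    print(atom.label, x, y)
```

Note that an atom wider than the target page just overflows on its own line in this sketch, since splitting it is exactly what the atom rule forbids.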



