
The perfection of those examples makes me suspect they are cherry-picked or part of the training data. The handwritten text in particular is not always clear and could reasonably be interpreted differently; I'd expect a machine-learning model to get at least some things wrong some of the time.

If I wanted to use this in an application, I'd definitely want to see accuracy figures on validation data, as well as a few failure cases, to check whether the output remains reasonable even when it is wrong.
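For concreteness, here's a minimal sketch of the kind of figure I'd want, a mean character error rate over held-out pairs (the sample pairs below are made up for illustration):

    def edit_distance(a: str, b: str) -> int:
        # Classic dynamic-programming Levenshtein distance.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def char_error_rate(predicted: str, reference: str) -> float:
        # CER = edits needed to fix the prediction / reference length.
        return edit_distance(predicted, reference) / max(len(reference), 1)

    # Hypothetical validation pairs (prediction, ground truth):
    pairs = [(r"E = mc^2", r"E = mc^2"), (r"\frac{1}{2}", r"\frac{1}{2}x")]
    print(sum(char_error_rate(p, r) for p, r in pairs) / len(pairs))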




The examples are actually very simple compared to a lot of the crazy stuff Mathpix can recognize, so it's an honest representation of its capabilities. Mathpix is built for perfection because 99% isn't good enough.


> the handwritten text is not always clear and could reasonably be interpreted differently

Digital pen input contains more information than the resulting bitmap: stroke order, direction, and timing are all lost when rasterizing.

That info is how old devices were able to reliably recognize characters written with a stylus. It worked well even on prehistoric hardware, such as the 16 MHz CPU and 128 kB of RAM in the first Palm PDA.
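Roughly, the difference looks like this (a toy sketch; the types and the rasterizer are made up, and a real rasterizer would draw line segments rather than plot sample points):

    from dataclasses import dataclass

    @dataclass
    class Point:
        x: int
        y: int
        t: float  # timestamp: ordering and timing survive only here

    # A stroke is an ordered list of timed points; a character is a list
    # of strokes, so pen-up/pen-down boundaries are explicit too.
    stroke = [Point(0, 0, 0.00), Point(5, 5, 0.02), Point(10, 0, 0.04)]

    def rasterize(strokes, w=16, h=16):
        # Burn the ink into a bitmap (sample points only, for brevity).
        # Stroke order, direction, timing, and pen lifts are all gone
        # after this step.
        grid = [[0] * w for _ in range(h)]
        for s in strokes:
            for p in s:
                grid[p.y][p.x] = 1
        return grid

    bitmap = rasterize([stroke])  # offline OCR sees only this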


In the OCR world this is known as offline OCR (OCR on the bitmap) vs. online OCR (OCR on the stroke information).

Offline is way harder than online.
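For example, direction-of-travel features are nearly free in the online case (a sketch; coordinates are screen-style, with y growing downward):

    import math

    def direction_features(points):
        # Writing direction between consecutive pen samples, in radians.
        # Trivial from stroke data; must be inferred from pixels offline.
        return [math.atan2(q[1] - p[1], q[0] - p[0])
                for p, q in zip(points, points[1:])]

    # An "L" written the usual way: down stroke, then rightward stroke.
    ell = [(0, 0), (0, 5), (0, 10), (5, 10), (10, 10)]
    print(direction_features(ell))  # [pi/2, pi/2, 0.0, 0.0]
    # The same "L" as a bitmap carries no hint of which leg came first.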



