
Around 2005 I worked for the Harvard library for a while (as a part-time student job), examining the quality of a lot of these Google scans.

I’m frankly not very impressed by Google’s scanning process, OCR, image detection, de-warping, contrast adjustment, general QA/QC, etc.

Quality is variable, peak quality is mediocre, and the results are largely useless for anything but text.

Like a lot of other parts of Google, it’s a case where they tried to cheap out on trained human labor and make up for it with algorithms.

On the upside, even a mediocre book scan is better than nothing.



