Continuing the discussion from What format should I store data in?:
Split this off from @Sandra’s comment in the data format thread because it’s really a worthwhile thing for us to talk about. (Both @nqemlen and @clriley might be interested in this too.) Nick and I were making use of the OCR’d content available on the Internet Archive, which is surprisingly good for Old Spanish and Aymara, but as Sandra mentions, the software the Internet Archive actually uses to do the OCR is closed source.
Hoping others will jump in here, but by way of getting the ball rolling I’ll mention one interesting online demo that I actually use fairly frequently for small jobs, just because it’s so idiot-proof:
Have you used OCR to deal with legacy print materials? How did it go?
Thanks for bringing up the topic, Sandra!