Experimental app for optical character recognition (OCR).
Runs the open-source Tesseract 3.00 OCR engine to find text in images captured by the device camera.
The purpose of this app is to demonstrate OCR running on an Android device. Conventionally, OCR is run on pages of printed text scanned with a flatbed scanner. In contrast, running OCR on images captured by a smartphone or tablet camera gives much lower-quality, but interesting, recognition results.
The default single-shot capture runs OCR on a snapshot image that's captured when you click the shutter button, like a regular photo.
When the "continuous preview" checkbox is checked, the app shows a dynamic, real-time display of what the device is recognizing right beside the camera viewfinder. The continuous preview works best on a fast device.
Translation (powered by Google/Bing) can be run after the OCR is finished. The translation results are of limited practical use, though, because the errors in OCR are compounded by the machine translation.
Some notes on using this app:
- Hold down the on-screen shutter button to autofocus, and release the button to take the picture. Or just tap the button to take the picture without autofocus.
- The same suggestions for using the camera effectively apply here as in other apps, such as Google Goggles or Google Docs: hold the camera steady, with the lens perpendicular to the words or characters you want to capture, and be sure the autofocus engages before taking the picture. Also, the OCR engine expects text to be approximately horizontal when the picture is captured in landscape orientation.
- To copy text to the clipboard, long-press on the recognized text or translated text.
- For recognizing individual Chinese characters, set the page segmentation mode to "single character."
- This app is a mash-up of several open-source projects: the Tesseract OCR engine, Tesseract Tools for Android (tesseract-android-tools), the ZXing Barcode Scanner, google-api-translate-java, microsoft-translator-java-api, and a grad school class project. Language data downloads use files in the tesseract-ocr project at http://code.google.com/p/tesseract-ocr/downloads/list.
- Supported languages for OCR: Bulgarian, Catalan, Chinese (Simplified), Chinese (Traditional), Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Polish, Portuguese, Romanian, Russian, Serbian (Latin), Slovak, Slovenian, Spanish, Swedish, Tagalog, Turkish, Ukrainian, and Vietnamese.
- Thanks to the contributors: Spoorthi, Hunvil, Jingjing, Xuyuan, and Mandar.
@Phillip - Yes, I plan to release the source code in a few weeks. I'm working on commenting the code and figuring out how best to handle the project dependencies. I'm not sure yet whether I should just check the referenced JARs into the repository or handle them some other way. There will also be a separate repository for my fork of tesseract-android-tools.