I recently became curious about practical OCR software, and installed a bunch of related packages on the laptop I name ‘Klystron’, including the ‘tesseract-ocr-xxx’ packages. This batch of installations went on to include ‘gImageReader’, an actual GUI with which text can be read from images.
However, this revealed a cute little application that was okay, I guess, but that would crash whenever I exited it. The solution to this problem was to custom-compile the latest version.
The package manager offers version 2.92 under Debian / Jessie. The version I custom-compiled is 3.1.91, and it is Qt 5-based.
In order to compile that, I first needed to custom-compile Qt5-Spell, because the version which ships from the package manager is not good enough either. And when I first tried to compile Qt5-Spell in the required Qt5 mode (it defaults to Qt4 mode), I also ran into this peculiar error message:
Qt5LinguistTools Not Found …
The fix is to install the package ‘qttools5-dev’. This can be confusing, because there is also a package named ‘qttools5-dev-tools’, which may already be installed without making the error go away.
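Because the two package names are so easy to mix up, a quick check of which one is actually installed can save some head-scratching. What follows is only a minimal sketch: the helper name ‘missing_packages’ is my own invention for illustration, and it assumes a Debian-style system on which ‘dpkg-query’ is available.

```shell
# Hypothetical helper: print the names of the given Debian packages
# that are NOT currently installed, one per line.
missing_packages() {
    for pkg in "$@"; do
        # For an installed package, dpkg-query reports "install ok installed".
        # For an unknown package it fails, which leaves the status empty.
        status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null || true)
        case "$status" in
            "install ok installed") ;;
            *) echo "$pkg" ;;
        esac
    done
}

# qttools5-dev is the package that resolved the error above.
for pkg in $(missing_packages qttools5-dev); do
    echo "Missing: $pkg (try: sudo apt-get install $pkg)"
done
```

If the loop prints nothing, the package is already present, and the cause of the error lies elsewhere.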
But now that I have surmounted these problems, I have a more serious application installed, and one that does not crash on exit.
Note: This application uses a hard-coded version of ‘tesseract’, not the version which I installed from the package manager. Yet I have the impression that installing additional data files from the package manager has also added languages which ‘gImageReader’ can read.
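One way to see which language data the package-manager version of ‘tesseract’ has found is to ask the command-line program itself. This is only a sketch: the function name ‘list_ocr_languages’ is made up for illustration, and it reports what the command-line engine sees, which is not necessarily what ‘gImageReader’ sees through its own engine.

```shell
# Hypothetical helper: print the language codes known to the
# command-line tesseract, one per line.
list_ocr_languages() {
    # --list-langs prints a header line followed by one code per line.
    # Some versions write this to stderr, so both streams are merged
    # before the header is filtered out.
    tesseract --list-langs 2>&1 | grep -v '^List of available languages'
}

# Only run the listing if a command-line tesseract is on the PATH.
if command -v tesseract >/dev/null 2>&1; then
    list_ocr_languages
else
    echo "tesseract not found on PATH"
fi
```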
(Edit 06/05/2016 : ) It is also possible to install ‘gImageReader’ by adding to our sources list the special repository hosted by the application's author, Sandro Mani.
$ sudo add-apt-repository ppa:sandromani/gimagereader
$ sudo apt-get update
$ sudo apt-get install gimagereader tesseract-ocr tesseract-ocr-eng
In this case, we are installing a binary from a third party, and may also need to install the repository's public key. This version uses the externally-supplied ‘tesseract’ engine, but it is a customized version of ‘tesseract’, which gImageReader is able to use. One reason to install the Debian version of ‘tesseract’ as well could be to have a version that is closer to what Google designed, and that can be invoked from the command line or from within other applications.
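To illustrate what invoking ‘tesseract’ from the command line looks like, here is a small sketch. The wrapper ‘ocr_image’ and the file names ‘scan.png’ and ‘scan-text’ are placeholders of mine, not anything that ships with the packages. The underlying call follows the usual form: the input image, then an output base name to which ‘tesseract’ appends ‘.txt’, then ‘-l’ with a language code.

```shell
# Hypothetical helper: OCR one image with the command-line tesseract.
# Usage: ocr_image <image> <output-base> [language]
# The recognized text ends up in <output-base>.txt.
ocr_image() {
    image=$1
    outbase=$2
    lang=${3:-eng}    # default to English if no language is given
    tesseract "$image" "$outbase" -l "$lang"
}

# Example call, guarded so that nothing happens when tesseract or the
# placeholder input file is absent.
if command -v tesseract >/dev/null 2>&1 && [ -f scan.png ]; then
    ocr_image scan.png scan-text eng && cat scan-text.txt
fi
```

Any language code passed as the third argument needs its matching ‘tesseract-ocr-xxx’ data package to be installed.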