
tessdata

Tesseract Training Data


Description

Helper function to download training data from the official tessdata repository. Use this function only on Windows and macOS. On Linux, install training data with the system package manager (yum or apt-get) instead.

Usage

tesseract_download(lang, datapath = NULL, progress = interactive())

Arguments

lang

three-letter code for the language; see the tessdata repository for available codes.

datapath

destination directory in which to store the downloaded file

progress

print progress while downloading
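
The arguments above can be combined to keep training data in a project-local directory instead of the default location. The following is a minimal sketch, not part of the package: the helper name local_engine and the temporary directory are assumptions made for illustration, and running it requires network access and an installed tesseract package.

```r
# Hypothetical helper (not part of the tesseract package): download one
# language into a chosen directory and return an OCR engine that reads
# its training data from that same directory.
local_engine <- function(lang, dir = file.path(tempdir(), "tessdata")) {
  # Make sure the destination directory exists before downloading
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  # progress = FALSE suppresses the progress output, e.g. in scripts
  tesseract::tesseract_download(lang, datapath = dir, progress = FALSE)
  # Point the engine at the directory that now holds the training data
  tesseract::tesseract(lang, datapath = dir)
}

# Usage (needs network access):
# german <- local_engine("deu")
```

Passing the same directory to both tesseract_download() and tesseract() is the important part; otherwise the engine may look for the file in the default data path.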

Details

Tesseract uses training data to perform OCR. Most systems default to English training data. To improve OCR performance for other languages, you can install the training data from your distribution. For example, to install the Spanish training data, install the package tesseract-ocr-spa (Debian, Ubuntu) or tesseract-langpack-spa (Fedora, EPEL).

On Windows and macOS you can install languages using the tesseract_download function, which downloads training data directly from GitHub and stores it in the path on disk given by the TESSDATA_PREFIX variable.
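
To see where training data is stored and which languages still need to be installed, something like the following may help. It is a sketch under assumptions: missing_langs is a hypothetical helper, and the language codes "eng" and "spa" are only examples. The datapath and available fields come from tesseract_info().

```r
# Hypothetical helper: which of the wanted languages are not installed yet?
missing_langs <- function(wanted, available) setdiff(wanted, available)

# Only query the engine if the tesseract package is installed
if (requireNamespace("tesseract", quietly = TRUE)) {
  info <- tesseract::tesseract_info()
  # Directory the engine currently reads training data from
  cat("Training data directory:", info$datapath, "\n")
  # Environment variable mentioned above; may be unset in a fresh session
  cat("TESSDATA_PREFIX:", Sys.getenv("TESSDATA_PREFIX", unset = "(not set)"), "\n")
  # Languages that would still have to be installed or downloaded
  print(missing_langs(c("eng", "spa"), info$available))
}
```

Each code returned by missing_langs() could then be passed to tesseract_download() on Windows or macOS, or installed through the distribution's package manager on Linux.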

References

See Also

Other tesseract: ocr(), tesseract()

Examples

## Not run: 
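# Download the French training data only if it is not installed yet
if (!"fra" %in% tesseract_info()$available)
  tesseract_download("fra")
# Create a French engine and run OCR on a sample image
french <- tesseract("fra")
text <- ocr("https://jeroen.github.io/images/french_text.png", engine = french)
cat(text)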

## End(Not run)

tesseract

Open Source OCR Engine

v4.1.1
Apache License 2.0
Authors
Jeroen Ooms [aut, cre] (<https://orcid.org/0000-0002-4035-0289>)
