ocr

Image Text OCR


Description

Extract text from an image using the tesseract package.

Usage

image_ocr(image, language = "eng", HOCR = FALSE, ...)

image_ocr_data(image, language = "eng", ...)

Arguments

image

magick image object returned by image_read() or image_graph()

language

passed to tesseract. To install additional languages see instructions in tesseract_download().

HOCR

if TRUE return results as HOCR xml instead of plain text

...

additional parameters passed to tesseract
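
To illustrate the HOCR argument described above, here is a minimal sketch that runs OCR twice on the same image, once as plain text and once as HOCR XML. The test image is drawn locally with image_blank() and image_annotate() so no download is needed; the text "Hello OCR" and the image size are arbitrary choices for this illustration, not part of the original page.

```r
library(magick)

# Draw a small black-on-white test image (illustrative content only)
img <- image_blank(width = 600, height = 120, color = "white")
img <- image_annotate(img, "Hello OCR", size = 60,
                      gravity = "center", color = "black")

# tesseract is an optional dependency, so check for it first
if (requireNamespace("tesseract", quietly = TRUE)) {
  plain <- image_ocr(img)               # default: plain text
  hocr  <- image_ocr(img, HOCR = TRUE)  # same image, HOCR XML as a string
  cat(substr(hocr, 1, 300), "\n")       # show the beginning of the XML
}
```

The HOCR result is a character string, so it can be handed to an XML parser if word positions are needed.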

Details

To use this function you need to install the tesseract package first:

install.packages("tesseract")

Best results are obtained if you set the correct language in tesseract. To install additional languages see instructions in tesseract_download().
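
As a rough sketch of the language setup described above, the following checks whether a language is already installed, downloads it if not, and passes it to image_ocr(). The language code "fra" and the sample text are assumptions chosen only for illustration, and the download step needs network access.

```r
library(magick)

if (requireNamespace("tesseract", quietly = TRUE)) {
  lang <- "fra"  # illustrative choice; use the language of your image

  # Download the training data only if it is not installed yet
  if (!lang %in% tesseract::tesseract_info()$available) {
    tesseract::tesseract_download(lang)
  }

  # Locally drawn sample image (hypothetical content)
  img <- image_blank(width = 800, height = 120, color = "white")
  img <- image_annotate(img, "Bonjour le monde", size = 60,
                        gravity = "center", color = "black")

  txt <- image_ocr(img, language = lang)
  cat(txt)
}
```

tesseract_info()$available lists the languages tesseract can currently use, which makes it a convenient check before calling image_ocr() with a non-default language.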

Examples

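library(magick)

# Run only when the optional tesseract package is available
if (require("tesseract")) {
  img <- image_read("http://jeroen.github.io/images/testocr.png")
  image_ocr(img)       # recognized text as a character string
  image_ocr_data(img)  # word-level results as a data frame
}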
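
The word-level output of image_ocr_data() can be filtered like any data frame. The sketch below keeps only words recognized with high confidence. It assumes the result for a single-frame image is one data frame with `word` and `confidence` columns (confidence on a 0 to 100 scale), as produced by tesseract::ocr_data(); the threshold of 80 and the sample text are arbitrary choices for illustration.

```r
library(magick)

# Locally drawn sample image (hypothetical content)
img <- image_blank(width = 800, height = 120, color = "white")
img <- image_annotate(img, "Reading text from images", size = 50,
                      gravity = "center", color = "black")

if (requireNamespace("tesseract", quietly = TRUE)) {
  words <- image_ocr_data(img)

  # Assumption: one data frame with 'word' and 'confidence' columns
  stopifnot(is.data.frame(words),
            all(c("word", "confidence") %in% names(words)))

  # Keep words above an arbitrary confidence threshold
  confident <- words[words$confidence > 80, ]
  print(confident)
}
```

Lowering the threshold keeps more words at the cost of more misreadings, so the right value depends on image quality.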

magick

Advanced Graphics and Image-Processing in R

v2.7.2
MIT + file LICENSE
Authors
Jeroen Ooms [aut, cre] (<https://orcid.org/0000-0002-4035-0289>)
Initial release