OCR text extraction
Perform OCR text extraction. This requires that the tesseract package
is installed.
pdf_ocr_text(pdf, pages = NULL, opw = "", upw = "", language = "eng", dpi = 600)
pdf_ocr_data(pdf, pages = NULL, opw = "", upw = "", language = "eng", dpi = 600)
pdf | file path or raw vector with pdf data
pages | which pages of the pdf file to extract
opw | string with owner password to open pdf
upw | string with user password to open pdf
language | passed to tesseract to specify the language of the engine
dpi | resolution at which to render the image that is passed to tesseract::ocr
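A minimal usage sketch of the two functions follows. The file name "scanned.pdf" is a hypothetical placeholder, and the lower dpi value is only an illustrative choice that trades accuracy for speed. The calls are guarded so that nothing runs unless both packages and the file are available.

```r
# Hypothetical input file; replace with the path to your own scanned PDF.
pdf_file <- "scanned.pdf"

have_pkgs <- requireNamespace("pdftools", quietly = TRUE) &&
  requireNamespace("tesseract", quietly = TRUE)

if (have_pkgs && file.exists(pdf_file)) {
  # OCR only the first page, using the English engine.
  # dpi = 300 is an illustrative value; the documented default is 600.
  text <- pdftools::pdf_ocr_text(pdf_file, pages = 1, language = "eng", dpi = 300)
  cat(text)

  # Same arguments, but requesting the structured OCR output instead of plain text.
  words <- pdftools::pdf_ocr_data(pdf_file, pages = 1, language = "eng", dpi = 300)
  str(words, max.level = 1)
}
```

For a password-protected file, the opw or upw argument would be supplied in the same call.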