encodings

Conversion between corpus and native encoding.


Description

Utility functions to convert character vectors between the native encoding and the encoding of the corpus.

Usage

as.utf8(x, from)

as.nativeEnc(x, from)

as.corpusEnc(x, from = localeToCharset()[1], corpusEnc)

Arguments

x

the object (a character vector)

from

encoding of the input character vector

corpusEnc

encoding of the corpus (e.g. "latin1", "UTF-8")

Details

The encoding of a corpus and the encoding of the terminal (the native encoding) may differ. If no conversion is carried out between them, output may be garbled or results may be wrong. The functions as.nativeEnc and as.corpusEnc are auxiliary functions that perform this conversion. The functions as.nativeEnc and as.utf8 deliberately remove the explicit encoding declaration from their result, to avoid warnings that may occur with character vector columns in a data.table object.
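
The idea of converting a string and then dropping its encoding declaration can be sketched in base R. This is only an illustration under assumptions, not polmineR's implementation: it uses iconv and Encoding from base R, assumes a corpus encoded as "latin1", and the variable names are hypothetical.

```r
# Build the latin1 byte sequence for "Größe" and declare its encoding.
# (Assumption: the corpus is encoded as latin1.)
x_latin1 <- rawToChar(as.raw(c(0x47, 0x72, 0xf6, 0xdf, 0x65)))
Encoding(x_latin1) <- "latin1"

# Convert from the assumed corpus encoding to UTF-8 with base R's iconv.
x_utf8 <- iconv(x_latin1, from = "latin1", to = "UTF-8")

# Drop the explicit encoding declaration while leaving the bytes untouched,
# analogous to what the Details describe for as.nativeEnc and as.utf8.
x_unmarked <- x_utf8
Encoding(x_unmarked) <- "unknown"

# Inspect the declared encodings and the underlying bytes.
print(Encoding(x_latin1))
print(Encoding(x_utf8))
print(Encoding(x_unmarked))
print(charToRaw(x_unmarked))
```

Removing the declaration changes only the metadata attached to the string: the bytes of x_unmarked are the same as those of x_utf8.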


polmineR

Verbs and Nouns for Corpus Analysis

v0.8.5
GPL-3
Authors
Andreas Blaette [aut, cre] (<https://orcid.org/0000-0001-8970-8010>), Christoph Leonhardt [ctb]
Initial release
2020-09-22
