
registry_eval

Evaluate registry file.


Description

Functions to extract information from a registry file describing a corpus. Several of these operations could also be accomplished with the 'cwb-regedit' tool; the functions defined here ensure that the registry can be evaluated without a full installation of the CWB.

Usage

registry_get_name(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_id(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_home(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_info(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_encoding(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_p_attributes(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_s_attributes(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

registry_get_properties(corpus, registry = Sys.getenv("CORPUS_REGISTRY"))

Arguments

corpus

name of the CWB corpus

registry

directory of the registry (defaults to the CORPUS_REGISTRY environment variable)

Details

An appendix to the 'Corpus Encoding Tutorial' (http://cwb.sourceforge.net/files/CWB_Encoding_Tutorial.pdf) includes an explanation of the registry file format.
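To make the kind of information these functions extract more concrete, here is a minimal base R sketch that reads fields from registry-style lines. It is not the polmineR implementation: the sample registry content, the corpus name "demo", the paths, and the helper sketch_field() are all assumptions made for illustration, and the real format has more features than this handles (see the tutorial appendix).

```r
# Illustrative registry content (assumed, simplified): header fields,
# one corpus property line, positional (ATTRIBUTE) and structural
# (STRUCTURE) attribute declarations.
registry_lines <- c(
  "NAME \"Demo corpus\"",
  "ID   demo",
  "HOME /tmp/demo/indexed_corpus",
  "INFO /tmp/demo/indexed_corpus/.info",
  "##:: charset = \"utf8\"",
  "ATTRIBUTE word",
  "ATTRIBUTE pos",
  "STRUCTURE text",
  "STRUCTURE text_id"
)

# Hypothetical helper (not part of polmineR): return the value(s) that
# follow a keyword at the start of a line.
sketch_field <- function(lines, keyword) {
  pattern <- paste0("^", keyword, "[[:space:]]+(.*)$")
  hits <- grep(pattern, lines, value = TRUE)
  values <- sub(pattern, "\\1", hits)
  # Strip surrounding whitespace and double quotes.
  gsub("^[[:space:]\"]+|[[:space:]\"]+$", "", values)
}

corpus_name  <- sketch_field(registry_lines, "NAME")
corpus_id    <- sketch_field(registry_lines, "ID")
corpus_home  <- sketch_field(registry_lines, "HOME")
p_attributes <- sketch_field(registry_lines, "ATTRIBUTE")
s_attributes <- sketch_field(registry_lines, "STRUCTURE")
```

In this sketch, p_attributes and s_attributes are character vectors with one element per declaration line, which is roughly the shape one would expect from registry_get_p_attributes and registry_get_s_attributes.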

registry_get_encoding parses the registry file of a corpus and returns the encoding defined there (corpus property "charset"). If the registry file does not define the corpus property "charset", the CWB standard encoding ("latin1") is returned to prevent errors. Note that RcppCWB::cl_charset_name is equivalent but faster, as it uses the internal C representation of a corpus rather than parsing the registry file.
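The fallback behaviour described above can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: sketch_get_encoding() is a hypothetical helper, the registry files are written to a temporary directory, and the sketch assumes that the registry file is named after the lowercased corpus id and that the property is declared on a line of the form ##:: charset = "...".

```r
# Hypothetical helper (not part of polmineR): look up the "charset"
# corpus property and fall back to "latin1" if it is not declared.
sketch_get_encoding <- function(corpus, registry = Sys.getenv("CORPUS_REGISTRY")) {
  # Assumption: the registry file is named after the lowercased corpus id.
  regfile <- file.path(registry, tolower(corpus))
  lines <- readLines(regfile, warn = FALSE)
  pattern <- "^##::[[:space:]]*charset[[:space:]]*=[[:space:]]*\"([^\"]+)\".*$"
  hits <- grep(pattern, lines, value = TRUE)
  if (length(hits) == 0L) {
    return("latin1")  # CWB standard encoding used as the default
  }
  sub(pattern, "\\1", hits[1])
}

# Throwaway registry directory with two illustrative registry files.
registry_dir <- tempfile("registry")
dir.create(registry_dir)
writeLines(
  c("NAME \"With charset\"", "ID withcharset",
    "##:: charset = \"utf8\"", "ATTRIBUTE word"),
  file.path(registry_dir, "withcharset")
)
writeLines(
  c("NAME \"No charset\"", "ID nocharset", "ATTRIBUTE word"),
  file.path(registry_dir, "nocharset")
)

enc_declared <- sketch_get_encoding("WITHCHARSET", registry = registry_dir)
enc_default  <- sketch_get_encoding("NOCHARSET", registry = registry_dir)
```

Here enc_declared holds the declared value and enc_default the "latin1" fallback. Returning a default rather than NA keeps downstream encoding conversions from failing on corpora whose registry omits the property.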

Examples

library(polmineR)
registry_get_encoding("REUTERS")

polmineR

Verbs and Nouns for Corpus Analysis

v0.8.5
GPL-3
Authors
Andreas Blaette [aut, cre] (<https://orcid.org/0000-0001-8970-8010>), Christoph Leonhardt [ctb]
Initial release
2020-09-22
