Decode Structural Attribute.
Get a data.frame with left and right corpus positions (cpos) for structural attributes and values.
s_attribute_decode(corpus, data_dir, s_attribute, encoding = NULL, registry = Sys.getenv("CORPUS_REGISTRY"), method = c("R", "Rcpp"))
corpus | a CWB corpus |
data_dir | data directory where binary files for corpus are stored |
s_attribute | a structural attribute |
encoding | encoding of the values ("latin-1" or "utf-8") |
registry | registry directory |
method | character vector, whether to use "R" or "Rcpp" implementation |
Two approaches are implemented: A pure R solution will decode the files directly in the directory specified by data_dir. An implementation using Rcpp will use the registry file for corpus to find the data directory.
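To make the pure R approach more concrete, here is a minimal sketch of reading region boundaries from a binary file with base R. It assumes that region boundaries are stored as consecutive pairs of 4-byte big-endian integers in a file named after the structural attribute with an ".rng" extension; this layout is an assumption about the CWB data format and is not stated on this page. The temporary file and its contents are made up for illustration and do not reproduce the package's actual implementation.

```r
# Hypothetical stand-in for a "<s_attribute>.rng" file in data_dir.
# Assumption: regions are stored as pairs of 4-byte big-endian integers.
rng_file <- tempfile(fileext = ".rng")
writeBin(c(0L, 4L, 5L, 9L), rng_file, size = 4L, endian = "big")

# Number of integers in the file: file size in bytes divided by 4.
n_int <- file.info(rng_file)[["size"]] / 4

# Read all integers, then reshape into one row per region.
rng <- readBin(rng_file, what = integer(), size = 4L, n = n_int, endian = "big")
regions <- matrix(
  rng, ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("cpos_left", "cpos_right"))
)
regions
```

Reading the whole file in one call and reshaping afterwards keeps the sketch short; the annotation values would have to be read from separate files, which is not shown here.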
A data.frame with three columns. Column cpos_left holds the start corpus positions of the structural annotations, column cpos_right the end corpus positions. Column value holds the value of each annotation.
a character vector
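The three-column result described above lends itself to ordinary data.frame operations. The following sketch works on a made-up data.frame in the documented shape (the positions and values are invented, not taken from a real corpus) and assumes that both corpus positions are inclusive, so that a region spans cpos_right - cpos_left + 1 tokens. The names n_tokens and tokens_by_value are introduced here for illustration only.

```r
# Made-up result in the documented three-column shape (not real corpus data).
regions <- data.frame(
  cpos_left = c(0L, 10L, 25L),
  cpos_right = c(9L, 24L, 30L),
  value = c("usa", "canada", "usa"),
  stringsAsFactors = FALSE
)

# Region length in tokens, assuming both corpus positions are inclusive.
regions$n_tokens <- regions$cpos_right - regions$cpos_left + 1L

# Total number of tokens per annotation value.
tokens_by_value <- tapply(regions$n_tokens, regions$value, sum)
tokens_by_value
```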
registry <- if (!check_pkg_registry_files()) use_tmp_registry() else get_pkg_registry()
Sys.setenv(CORPUS_REGISTRY = registry)

# pure R implementation (Rcpp implementation fails on Windows in vanilla mode)
b <- s_attribute_decode(
  data_dir = system.file(package = "RcppCWB", "extdata", "cwb", "indexed_corpora", "reuters"),
  s_attribute = "places", method = "R"
)