Remove word classes
This method removes tokens of the specified word classes from tagged text objects.
filterByClass(txt, ...)

## S4 method for signature 'kRp.text'
filterByClass(
  txt,
  corp.rm.class = "nonpunct",
  corp.rm.tag = c(),
  as.vector = FALSE,
  update.desc = TRUE
)
txt | An object of class kRp.text.
... | Additional options, currently unused.
corp.rm.class | A character vector with word classes which should be removed. The default value is "nonpunct".
corp.rm.tag | A character vector with valid POS tags which should be removed.
as.vector | Logical. If TRUE, only a character vector is returned.
update.desc | Logical. Defaults to TRUE.
An object of the input class. If as.vector=TRUE, returns only a character vector.
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  filterByClass(tokenized.obj)
} else {}
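To make the roles of the class filter, the tag filter and the vector return option more concrete, the following base R sketch mimics the filtering idea on a small hand-made token table. It does not use koRpus and is not the package's implementation: the data frame, its column names (token, tag, wclass), the example tags and classes, and the helper filter_tokens() are all assumptions made up for illustration.

```r
# Hypothetical token table; the column names and values are assumptions
# for illustration and do not reflect the internals of a kRp.text object.
tokens <- data.frame(
  token  = c("The", "cat", "sat", ".", "It", "purred", "!"),
  tag    = c("DT", "NN", "VBD", "SENT", "PP", "VBD", "SENT"),
  wclass = c("determiner", "noun", "verb", "fullstop",
             "pronoun", "verb", "fullstop"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of koRpus): drop every row whose word class
# is listed in rm.class or whose tag is listed in rm.tag.
filter_tokens <- function(df, rm.class = character(), rm.tag = character(),
                          as.vector = FALSE) {
  keep <- !(df$wclass %in% rm.class) & !(df$tag %in% rm.tag)
  result <- df[keep, , drop = FALSE]
  # Mirror the documented return behaviour: either the filtered table
  # or only the remaining tokens as a character vector.
  if (as.vector) result$token else result
}

# Remove sentence-ending punctuation, keep the table structure
no_stops <- filter_tokens(tokens, rm.class = "fullstop")

# Remove punctuation by class and determiners by tag, return tokens only
words_only <- filter_tokens(
  tokens, rm.class = "fullstop", rm.tag = "DT", as.vector = TRUE
)
```

In this sketch the class filter and the tag filter are combined, so a token is dropped if either one matches. Whether filterByClass() combines corp.rm.class and corp.rm.tag in the same way is not stated above and should be checked against the package itself.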