Import custom corpus data
Read data from a custom corpus into a valid object of class kRp.corp.freq.
read.corp.custom(corpus, caseSens = TRUE, log.base = 10, ...)

## S4 method for signature 'kRp.text'
read.corp.custom(
  corpus,
  caseSens = TRUE,
  log.base = 10,
  dtm = docTermMatrix(obj = corpus, case.sens = caseSens),
  as.feature = FALSE
)
corpus | An object of class
caseSens | Logical. If
log.base | A numeric value defining the base of the logarithm used for inverse document frequency (idf). See
... | Additional options for methods of the generic.
dtm | A document term matrix of the
as.feature | Logical, whether the output should be just the analysis results or the input object with the results added as a feature. Use
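To illustrate what log.base controls, here is a minimal sketch in base R of the common textbook definition of inverse document frequency, log(N / df). It does not call koRpus, and the exact formula used internally by the package may differ. The toy documents and the helper name idf() are hypothetical and only serve the illustration.

```r
# Toy corpus: three hypothetical documents, already split into tokens
docs <- list(
  doc1 = c("the", "cat", "sat"),
  doc2 = c("the", "dog", "ran"),
  doc3 = c("the", "cat", "ran")
)

# All distinct terms across the corpus
terms <- sort(unique(unlist(docs)))

# Document frequency: in how many documents does each term occur?
doc_freq <- sapply(terms, function(term) {
  sum(sapply(docs, function(doc) term %in% doc))
})

# Textbook idf with a configurable logarithm base
# (hypothetical helper, not part of koRpus)
idf <- function(doc_freq, n_docs, log.base = 10) {
  log(n_docs / doc_freq, base = log.base)
}

idf_10 <- idf(doc_freq, n_docs = length(docs), log.base = 10)
idf_2  <- idf(doc_freq, n_docs = length(docs), log.base = 2)

print(round(idf_10, 3))
print(round(idf_2, 3))
```

Changing the base only rescales the values by a constant factor, so the ranking of terms stays the same. A term that occurs in every document gets an idf of 0 regardless of the base.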
These methods enable a basic frequency analysis of a text corpus. That is, they do not just import analysis results such as LCC files, but the corpus material itself. The resulting object is of class kRp.corp.freq, so it can be used for frequency analysis by other functions and methods of this package.
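As a rough idea of what a basic frequency analysis on raw corpus material involves, the following base R sketch counts token frequencies with and without case sensitivity. It is not the koRpus implementation. The function count_tokens() is hypothetical, and it assumes that case-insensitive counting means folding tokens to lower case before counting.

```r
# Hypothetical token vector standing in for a tokenized text
tokens <- c("The", "cat", "and", "the", "dog", "and", "THE", "bird")

# Count token frequencies; with caseSens = FALSE, tokens are
# lower-cased first (an assumption made for this sketch)
count_tokens <- function(tokens, caseSens = TRUE) {
  if (!caseSens) {
    tokens <- tolower(tokens)
  }
  freq <- table(tokens)
  sort(freq, decreasing = TRUE)
}

freq_cs <- count_tokens(tokens, caseSens = TRUE)
freq_ci <- count_tokens(tokens, caseSens = FALSE)

print(freq_cs)
print(freq_ci)
```

With case sensitivity, "The", "the" and "THE" are counted as three separate types. Without it, they collapse into a single type, which changes both the number of types and their frequencies.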
An object of class kRp.corp.freq.
Depending on as.feature, either an object of class kRp.corp.freq, or an object of class kRp.text with the added feature corp_freq containing it.
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  # call read.corp.custom() on a tokenized text
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # if you call read.corp.custom() without as.feature=TRUE,
  # you will get its results directly
  en_corp <- read.corp.custom(
    tokenized.obj,
    caseSens=FALSE
  )

  # alternatively, you can also store those results as a
  # feature in the object itself
  tokenized.obj <- read.corp.custom(
    tokenized.obj,
    caseSens=FALSE,
    as.feature=TRUE
  )
  # results are now part of the object
  hasFeature(tokenized.obj)
  corpusCorpFreq(tokenized.obj)
} else {}