Split corpus or partition into speeches.
Split an entire corpus or a partition into speeches. The heuristic first splits
the corpus/partition into one partition per day, using the s-attribute provided
by s_attribute_date. These subcorpora are then split into speeches by speaker
name, using the s-attribute s_attribute_name. If the gap between two
contributions of a speaker is larger than the number of tokens supplied by the
argument gap, the contributions are assumed to be two separate speeches.
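The heuristic described above can be illustrated with a minimal base R sketch. It does not use polmineR and is not the package's implementation: the `regions` table, its column names and the helper `split_speeches()` are hypothetical, and measuring the gap as the number of tokens between the end of one region and the start of the next is an assumption.

```r
# Hypothetical region table: one row per region of the speaker s-attribute.
# cpos_left / cpos_right are the first and last token positions of a region.
regions <- data.frame(
  date = c("2009-10-27", "2009-10-27", "2009-10-27", "2009-10-27", "2009-10-28"),
  speaker = c("A", "B", "A", "A", "A"),
  cpos_left = c(0L, 100L, 150L, 1000L, 1400L),
  cpos_right = c(99L, 149L, 400L, 1300L, 1500L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: assign a speech number to every region.
split_speeches <- function(regions, gap = 500L) {
  # Step 1 and 2: treat every combination of day and speaker separately.
  groups <- split(regions, list(regions$date, regions$speaker), drop = TRUE)
  out <- lapply(groups, function(df) {
    df <- df[order(df$cpos_left), ]
    # Tokens between the end of one region and the start of the next one
    # (assumed definition of the gap).
    distance <- df$cpos_left[-1] - df$cpos_right[-nrow(df)] - 1L
    # Step 3: a new speech starts at the first region and after every large gap.
    df$speech_no <- cumsum(c(TRUE, distance > gap))
    df
  })
  out <- do.call(rbind, out)
  rownames(out) <- NULL
  # Name each speech by speaker, date and index, joined by underscores.
  out$speech_id <- paste(out$speaker, out$date, out$speech_no, sep = "_")
  out
}

result <- split_speeches(regions, gap = 500L)
print(result[, c("speaker", "date", "speech_no", "speech_id")])
```

In this toy data, the interruption of speaker A by speaker B is short, so the first two regions of A on 2009-10-27 are kept together, while the third region follows a gap of more than 500 tokens and is counted as a second speech.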
as.speeches(.Object, ...)

## S4 method for signature 'partition'
as.speeches(
  .Object,
  s_attribute_date = grep("date", s_attributes(.Object), value = TRUE),
  s_attribute_name = grep("name", s_attributes(.Object), value = TRUE),
  gap = 500,
  mc = FALSE,
  verbose = TRUE,
  progress = TRUE
)

## S4 method for signature 'subcorpus'
as.speeches(
  .Object,
  s_attribute_date = grep("date", s_attributes(.Object), value = TRUE),
  s_attribute_name = grep("name", s_attributes(.Object), value = TRUE),
  gap = 500,
  mc = FALSE,
  verbose = TRUE,
  progress = TRUE
)

## S4 method for signature 'corpus'
as.speeches(
  .Object,
  s_attribute_date = grep("date", s_attributes(.Object), value = TRUE),
  s_attribute_name = grep("name", s_attributes(.Object), value = TRUE),
  gap = 500,
  mc = FALSE,
  verbose = TRUE,
  progress = TRUE
)

## S4 method for signature 'character'
as.speeches(
  .Object,
  s_attribute_date = grep("date", s_attributes(.Object), value = TRUE),
  s_attribute_name = grep("name", s_attributes(.Object), value = TRUE),
  gap = 500,
  mc = FALSE,
  verbose = TRUE,
  progress = TRUE
)
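.Object | A partition, subcorpus or corpus object, or a length-one character vector naming a corpus.
... | Further arguments.
s_attribute_date | A length-one character vector, the s-attribute that provides the date.
s_attribute_name | A length-one character vector, the s-attribute that provides the speaker name.
gap | An integer value, the number of tokens above which contributions of a speaker are assumed to be separate speeches.
mc | Whether to use multicore, defaults to FALSE.
verbose | A logical value, defaults to TRUE.
progress | A logical value, defaults to TRUE.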
A partition_bundle. The names of the objects in the bundle are the speaker
name, the date of the speech and an index for the number of the speech on a
given day, concatenated by underscores.
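Because the names follow a fixed speaker, date, index scheme, they can be taken apart again. The sketch below is base R only and works under stated assumptions: the names in `speech_names` are made up for illustration, and dates are assumed to be in ISO format (YYYY-MM-DD), as in the GERMAPARLMINI example.

```r
# Hypothetical bundle names in the speaker_date_index scheme.
speech_names <- c(
  "Speaker A_2009-10-27_1",
  "Speaker A_2009-10-27_2",
  "Speaker B_2009-11-10_1"
)

# Anchor the date and the index at the end of the name, so that the
# speaker part may itself contain spaces or underscores.
pattern <- "^(.*)_([0-9]{4}-[0-9]{2}-[0-9]{2})_([0-9]+)$"

parsed <- data.frame(
  speaker = sub(pattern, "\\1", speech_names),
  date = as.Date(sub(pattern, "\\2", speech_names)),
  index = as.integer(sub(pattern, "\\3", speech_names)),
  stringsAsFactors = FALSE
)
print(parsed)
```

Matching the date and index from the end of the string, rather than splitting on every underscore, keeps the parsing robust when a speaker name contains an underscore.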
use("polmineR")
speeches <- as.speeches(
  "GERMAPARLMINI",
  s_attribute_date = "date",
  s_attribute_name = "speaker"
)
speeches_count <- count(speeches, p_attribute = "word")
tdm <- as.TermDocumentMatrix(speeches_count, col = "count")

bt <- partition("GERMAPARLMINI", date = "2009-10-27")
speeches <- as.speeches(bt, s_attribute_name = "speaker")
summary(speeches)

sp <- as.speeches(.Object = corpus("GERMAPARLMINI"), s_attribute_name = "speaker")