Get corpus positions for a query or queries.
Get matches for a query in a CQP corpus (subcorpus, partition etc.), optionally using the CQP syntax of the Corpus Workbench (CWB).
cpos(.Object, ...)
## S4 method for signature 'corpus'
cpos(
.Object,
query,
p_attribute = getOption("polmineR.p_attribute"),
cqp = is.cqp,
regex = FALSE,
check = TRUE,
verbose = TRUE,
...
)
## S4 method for signature 'character'
cpos(
.Object,
query,
p_attribute = getOption("polmineR.p_attribute"),
cqp = is.cqp,
check = TRUE,
verbose = TRUE,
...
)
## S4 method for signature 'slice'
cpos(
.Object,
query,
cqp = is.cqp,
check = TRUE,
p_attribute = getOption("polmineR.p_attribute"),
verbose = TRUE,
...
)
## S4 method for signature 'partition'
cpos(
.Object,
query,
cqp = is.cqp,
check = TRUE,
p_attribute = getOption("polmineR.p_attribute"),
verbose = TRUE,
...
)
## S4 method for signature 'subcorpus'
cpos(
.Object,
query,
cqp = is.cqp,
check = TRUE,
p_attribute = getOption("polmineR.p_attribute"),
verbose = TRUE,
...
)
## S4 method for signature 'matrix'
cpos(.Object)
## S4 method for signature 'hits'
cpos(.Object)
## S4 method for signature 'NULL'
cpos(.Object)
.Object |
A length-one |
... |
Used for reasons of backwards compatibility to
process arguments that have been renamed (e.g. |
query |
A |
p_attribute |
The p-attribute to search. Needs to be stated only if query
is not a CQP query. Defaults to |
cqp |
Either logical ( |
regex |
Interpret |
check |
A |
verbose |
A |
If the cpos()-method is applied on a character or
partition object, the result is a two-column matrix with the
regions (start and end corpus positions of the matches) for a query. CQP
syntax can be used. The encoding of the query is adjusted to conform to the
encoding of the CWB corpus. If there are no matches, NULL is
returned.
If the cpos()-method is called on a matrix object, the cpos
matrix is unfolded: the return value is an integer vector with the individual
corpus positions.
If .Object is a hits object, an integer vector is
returned with the individual corpus positions.
If .Object is a matrix, it is assumed to be a region
matrix, i.e. a two-column matrix with left and right corpus positions
in the first and second column, respectively. For many operations, such as
decoding the token stream, it is necessary to inflate the denoted regions
into a vector of all corpus positions referred to by the regions defined in
the matrix. The cpos-method for matrix objects performs
this task robustly.
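To illustrate what inflating a region matrix means, here is a minimal base R sketch. It is not the implementation used by the package; `inflate_regions()` and the example matrix are hypothetical and serve only to show the transformation from regions to individual corpus positions.

```r
# Hypothetical region matrix: one row per region, columns are left and right cpos.
regions <- matrix(c(5L, 7L, 12L, 12L, 20L, 22L), ncol = 2, byrow = TRUE)

# Illustrative helper (an assumption, not part of polmineR): expand every
# region into the sequence of corpus positions it covers.
inflate_regions <- function(m) {
  # Mirror the documented behaviour for NULL input: return an empty integer vector.
  if (is.null(m) || nrow(m) == 0L) return(integer())
  unlist(lapply(seq_len(nrow(m)), function(i) m[i, 1]:m[i, 2]))
}

positions <- inflate_regions(regions)
print(positions)
```

With a real region matrix, calling `cpos()` on the matrix would be the way to obtain this vector; the sketch only makes the shape of the result concrete.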
If .Object is NULL, the method will return an empty
integer vector. Used internally to handle NULL objects that may be
returned from the cpos-method if no matches are obtained for a
query.
Unless .Object is a matrix, a hits object or NULL (in which case
an integer vector is returned), the return value is a
matrix with two columns. The first column reports the left/starting
corpus positions (cpos) of the hits obtained. The second column reports the
right/ending corpus positions of the respective hit. The number of rows is
the number of hits. If there are no hits, a NULL object is returned.
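Because the result is either a two-column matrix or NULL, code that consumes it usually has to guard against the NULL case. The following base R sketch shows one way to do so. The matrix `hits_matrix` is made-up data standing in for a real result, and `count_hits()` and `hit_lengths()` are hypothetical helpers, not functions of the package.

```r
# Made-up stand-in for a two-column matrix of hits (left cpos, right cpos).
hits_matrix <- matrix(c(10L, 11L, 40L, 41L, 95L, 96L), ncol = 2, byrow = TRUE)

# Hypothetical helper: number of hits, treating NULL as zero hits.
count_hits <- function(x) if (is.null(x)) 0L else nrow(x)

# Hypothetical helper: number of tokens covered by each hit.
hit_lengths <- function(x) {
  if (is.null(x)) return(integer())
  x[, 2] - x[, 1] + 1L
}

n_hits <- count_hits(hits_matrix)
n_none <- count_hits(NULL)
lens <- hit_lengths(hits_matrix)
```

For a two-token query such as `'"Saudi" "Arabia"'`, every hit would be expected to span two corpus positions, which is what the made-up data mimics.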
library(polmineR)
use("polmineR")
# looking up single tokens
cpos("REUTERS", query = "oil")
corpus("REUTERS") %>% cpos(query = "oil")
corpus("REUTERS") %>% subset(grepl("saudi-arabia", places)) %>% cpos(query = "oil")
partition("REUTERS", places = "saudi-arabia", regex = TRUE) %>% cpos(query = "oil")
# using CQP query syntax
cpos("REUTERS", query = '"Saudi" "Arabia"')
corpus("REUTERS") %>% cpos(query = '"Saudi" "Arabia"')
corpus("REUTERS") %>%
subset(grepl("saudi-arabia", places)) %>%
cpos(query = '"Saudi" "Arabia"', cqp = TRUE)
partition("REUTERS", places = "saudi-arabia", regex = TRUE) %>%
cpos(query = '"Saudi" "Arabia"', cqp = TRUE)