Factor objects
A notable difference with ordinary factors is that Factor objects cannot contain NAs, at least for now.
Factor(x, levels, index=NULL, ...) # constructor function
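x, levels | At least one of x and levels must be supplied. |
index | NULL (the default) or an integer (or numeric) vector of valid positive indices (no NAs) into levels. |
... | Optional metadata columns. |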
There are 4 different ways to use the Factor() constructor function:
Factor(x, levels) (i.e. index is missing): In this case match(x, levels) is used internally to encode x as a Factor object. An error is raised if some elements in x cannot be matched to levels, so it's important to make sure that all the elements in x are represented in levels when doing Factor(x, levels).
Factor(x) (i.e. levels and index are missing): This is equivalent to Factor(x, levels=unique(x)).
Factor(levels=levels, index=index) (i.e. x is missing): In this case the encoding of the Factor object is supplied via index, that is, index must be an integer (or numeric) vector of valid positive indices (no NAs) into levels. This is the most efficient way to construct a Factor object.
Factor(levels=levels) (i.e. x and index are missing): This is a convenient way to construct a 0-length Factor object with the specified levels. In other words, it's equivalent to Factor(levels=levels, index=integer(0)).
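The four construction forms above can be sketched as follows. This is only an illustration that assumes the S4Vectors package is installed; the character vectors x and lv and the object names F_a to F_d are hypothetical and chosen for brevity.

```r
library(S4Vectors)

x  <- c("b", "a", "b", "c")      # hypothetical values to encode
lv <- c("a", "b", "c", "d")      # hypothetical levels

## 1. Explicit levels: the encoding is match(x, lv).
F_a <- Factor(x, levels=lv)

## 2. No levels: equivalent to Factor(x, levels=unique(x)).
F_b <- Factor(x)

## 3. Levels plus a precomputed encoding (no matching needed).
F_c <- Factor(levels=lv, index=c(2L, 1L, 2L, 3L))

## 4. Levels only: a 0-length Factor.
F_d <- Factor(levels=lv)

## Elements of x that are missing from the levels trigger an error,
## which is captured here as a message string.
bad <- tryCatch(Factor(c("a", "z"), levels=lv),
                error=function(e) conditionMessage(e))
```

Form 3 skips the call to match() entirely, which is why the text describes it as the most efficient way to build a Factor when the encoding is already known.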
A Factor object.
Factor objects support the same set of accessors as ordinary factors. That is:
length(x) to get the length of Factor object x.
names(x) and names(x) <- value to get and set the names of Factor object x.
levels(x) and levels(x) <- value to get and set the levels of Factor object x.
nlevels(x) to get the number of levels of Factor object x.
as.integer(x) to get the encoding of Factor object x.
Note that length(as.integer(x)) and names(as.integer(x)) are the same as length(x) and names(x), respectively.
In addition, because Factor objects are Vector derivatives, they support the mcols() and metadata() getters and setters.
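A brief sketch of these accessors on a small Factor follows. It assumes the S4Vectors package is installed; the object x, its names, and the score metadata column are hypothetical and exist only for this illustration.

```r
library(S4Vectors)

x <- Factor(c("b", "a", "b"), levels=c("a", "b", "c"))

length(x)                    # number of elements
nlevels(x)                   # number of levels

## Names are set on the Factor and carried over to its encoding.
names(x) <- c("k1", "k2", "k3")
enc <- as.integer(x)

## Replace the levels with a same-length vector; the encoding is untouched.
levels(x) <- toupper(levels(x))

## Vector-style annotations: one metadata column, one metadata entry.
mcols(x) <- DataFrame(score=c(0.1, 0.2, 0.3))
metadata(x) <- list(source="demo")
```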
unfactor(x) can be used to decode Factor object x. It returns an object of the same class as levels(x) and same length as x. Note that it is the analog of as.character() on ordinary factors, with the notable difference that unfactor(x) propagates the names on x.
For convenience, unfactor(x) also works on ordinary factor x.
unfactor() supports extra arguments use.names and ignore.mcols to control whether the names and metadata columns on the Factor object to decode should be propagated or not. By default they are propagated, that is, the default values for use.names and ignore.mcols are TRUE and FALSE, respectively.
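The decoding behavior described above can be illustrated with a short sketch, assuming the S4Vectors package is installed. The object x, its names, and the ordinary factor f are hypothetical.

```r
library(S4Vectors)

x <- Factor(c("b", "a", "b"))
names(x) <- c("k1", "k2", "k3")

## Default: the names on x are propagated to the decoded vector.
decoded <- unfactor(x)

## Drop the names explicitly.
decoded_bare <- unfactor(x, use.names=FALSE)

## unfactor() also accepts an ordinary factor.
f <- factor(c("b", "a", "b"))
decoded_f <- unfactor(f)
```

The result has the class of levels(x), so here it is a character vector; with levels that are a Vector derivative (e.g. IRanges), the result would be of that class instead.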
From vector or Vector to Factor: coercion of a vector-like object x to Factor is supported via as(x, "Factor") and is equivalent to Factor(x). There are 2 IMPORTANT EXCEPTIONS to this:
If x is an ordinary factor, as(x, "Factor") returns a Factor with the same levels, encoding, and names as x. Note that after coercing an ordinary factor to Factor, going back to factor again (with as.factor()) restores the original object with no loss.
If x is a Factor object, as(x, "Factor") is either a no-op (when x is a Factor instance), or a demotion to Factor (when x is a Factor derivative like GRangesFactor).
From Factor to integer: as.integer(x) is supported on Factor object x and returns its encoding (see Accessors section above).
From Factor to factor: as.factor(x) is supported on Factor object x and returns an ordinary factor where the levels are as.character(levels(x)).
From Factor to character: as.character(x) is supported on Factor object x and is equivalent to unfactor(as.factor(x)), which is also equivalent to as.character(unfactor(x)).
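The difference between as(f, "Factor") and Factor(f) on an ordinary factor is easy to miss, so here is a small sketch of it, assuming the S4Vectors package is installed. The factor f and the object names F_as and F_ctor are hypothetical.

```r
library(S4Vectors)

f <- factor(c("b", "a", "b"), levels=c("a", "b", "c"))

## Coercion keeps the levels and encoding of f.
F_as <- as(f, "Factor")

## The constructor treats f as the values to encode, so the levels
## become unique(f), which is itself an ordinary factor.
F_ctor <- Factor(f)

## Round trip back to an ordinary factor.
f_back <- as.factor(F_as)

## Encoding and character representation.
enc <- as.integer(F_as)
chr <- as.character(F_as)
```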
A Factor object can be subsetted with [, like an ordinary factor.
Two or more Factor objects can be concatenated with c(). Note that, unlike with ordinary factors, c() on Factor objects preserves the class, i.e. it returns a Factor object. In other words, c() acts as an endomorphism on Factor objects.
The levels of c(x, y) are obtained by appending to levels(x) the levels in levels(y) that are "new" i.e. that are not already in levels(x).
append(), which is implemented on top of c(), also works on Factor objects.
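Subsetting and the level-merging rule for c() can be sketched as follows, assuming the S4Vectors package is installed. The objects x and y are hypothetical.

```r
library(S4Vectors)

x <- Factor(c("a", "b", "a"))    # levels: "a", "b"
y <- Factor(c("c", "b"))         # levels: "c", "b"

## Subsetting with [ returns a Factor.
x_sub <- x[2:3]

## Concatenation returns a Factor; the levels of y that are not
## already in levels(x) are appended to levels(x).
xy <- c(x, y)
levels(xy)

## append() goes through c(), so it behaves the same way.
xy2 <- append(x, y)
```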
Comparing (e.g. ==, !=, <=, <, match()) and ordering (e.g. order(), sort(), rank()) Factor objects is supported and behaves as it does on the unfactored objects. For example, F1 <= F2, match(F1, F2), and sort(F1) are equivalent to unfactor(F1) <= unfactor(F2), match(unfactor(F1), unfactor(F2)), and sort(unfactor(F1)), respectively.
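The equivalences above can be checked on a small pair of Factors. This sketch assumes the IRanges package is installed (it attaches S4Vectors) and uses IRanges levels, as the examples on this page do; the particular start positions are hypothetical.

```r
library(IRanges)

## Two Factors of equal length but with different levels.
F1 <- Factor(IRanges(c(3, 1, 3, 2), width=10),
             levels=IRanges(1:4, width=10))
F2 <- Factor(IRanges(c(3, 2, 1, 2), width=10))

## Each operation is compared with the same operation on the
## unfactored objects.
eq_same    <- all((F1 == F2) == (unfactor(F1) == unfactor(F2)))
order_same <- identical(order(F1), order(unfactor(F1)))
match_same <- identical(match(F1, F2),
                        match(unfactor(F1), unfactor(F2)))

sort(F1)
```

Because the comparison is made on the decoded values, two Factors with different levels (as here) can still be compared element by element.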
Hervé Pagès, with contributions from Aaron Lun
factor in base R.
GRangesFactor objects in the GenomicRanges package.
IRanges objects in the IRanges package.
Vector objects for the parent class.
anyDuplicated in the BiocGenerics package.
showClass("Factor")  # Factor extends Vector

## ---------------------------------------------------------------------
## CONSTRUCTOR & ACCESSORS
## ---------------------------------------------------------------------
library(IRanges)
set.seed(123)
ir0 <- IRanges(sample(5, 8, replace=TRUE), width=10,
               names=letters[1:8], ID=paste0("ID", 1:8))

## Use explicit levels:
ir1 <- IRanges(1:6, width=10)
F1 <- Factor(ir0, levels=ir1)
F1
length(F1)
names(F1)
levels(F1)  # ir1
nlevels(F1)
as.integer(F1)  # encoding

## If we don't specify the levels, they'll be set to unique(ir0):
F2 <- Factor(ir0)
F2
length(F2)
names(F2)
levels(F2)  # unique(ir0)
nlevels(F2)
as.integer(F2)

## ---------------------------------------------------------------------
## DECODING
## ---------------------------------------------------------------------
unfactor(F1)

stopifnot(identical(ir0, unfactor(F1)))
stopifnot(identical(ir0, unfactor(F2)))

unfactor(F1, use.names=FALSE)
unfactor(F1, ignore.mcols=TRUE)

## ---------------------------------------------------------------------
## COERCION
## ---------------------------------------------------------------------
F2b <- as(ir0, "Factor")  # same as Factor(ir0)
stopifnot(identical(F2, F2b))

as.factor(F2)
as.factor(F1)

as.character(F1)  # same as unfactor(as.factor(F1)),
                  # and also same as as.character(unfactor(F1))

## On an ordinary factor 'f', 'as(f, "Factor")' and 'Factor(f)' are
## NOT the same:
f <- factor(sample(letters, 500, replace=TRUE), levels=letters)
as(f, "Factor")  # same levels as 'f'
Factor(f)        # levels **are** 'f'!

stopifnot(identical(f, as.factor(as(f, "Factor"))))

## ---------------------------------------------------------------------
## CONCATENATION
## ---------------------------------------------------------------------
ir3 <- IRanges(c(5, 2, 8:6), width=10)
F3 <- Factor(levels=ir3, index=2:4)
F13 <- c(F1, F3)
F13
levels(F13)

stopifnot(identical(c(unfactor(F1), unfactor(F3)), unfactor(F13)))

## ---------------------------------------------------------------------
## COMPARING & ORDERING
## ---------------------------------------------------------------------
F1 == F2   # same as unfactor(F1) == unfactor(F2)

order(F1)  # same as order(unfactor(F1))
order(F2)  # same as order(unfactor(F2))

## The levels of the Factor influence the order of the table:
table(F1)
table(F2)