Select Variables for a Formula Response or the RHS of a Formula
Select variables from a data frame whose names begin with a certain character string.
Select(data = list(), prefix = "y", lhs = NULL, rhs = NULL, rhs2 = NULL, rhs3 = NULL, as.character = FALSE, as.formula.arg = FALSE, tilde = TRUE, exclude = NULL, sort.arg = TRUE)
data |
A data frame or a matrix. |
prefix |
A vector of character strings, or a logical.
If a character then
the variables chosen from |
lhs |
A character string. The response of a formula. |
rhs |
A character string.
Included as part of the RHS of a formula.
Set |
rhs2, rhs3 |
Same as |
as.character |
Logical. Return the answer as a character string? |
as.formula.arg |
Logical. Is the answer a formula? |
tilde |
Logical.
If |
exclude |
Vector of character strings. Exclude these variables explicitly. |
sort.arg |
Logical. Sort the variables? |
This is meant as a utility function to avoid manually:
(i) making a cbind call to construct a big matrix response, and
(ii) constructing a formula involving a lot of terms.
The savings can be made because the variables of interest begin with some prefix, e.g., with the character "y".
If as.character = FALSE and as.formula.arg = FALSE then a matrix such as cbind(y1, y2, y3) is returned.
If as.character = TRUE and as.formula.arg = FALSE then a character string such as "cbind(y1, y2, y3)" is returned.
If as.character = FALSE and as.formula.arg = TRUE then a formula such as lhs ~ y1 + y2 + y3 is returned.
If as.character = TRUE and as.formula.arg = TRUE then a character string such as "lhs ~ y1 + y2 + y3" is returned.
See the examples below.
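The four return forms listed above can be mimicked by hand in base R. The following is only a minimal sketch of the idea, not the function's actual implementation; the data frame df and the helper pick_by_prefix() are hypothetical names introduced for illustration.

```r
# Hypothetical toy data; not taken from any package.
df <- data.frame(y1 = 1:3, y2 = 4:6, y3 = 7:9, x2 = c(0.1, 0.2, 0.3))

# Hypothetical helper: sorted names of the columns that begin with `prefix`.
pick_by_prefix <- function(data, prefix = "y") {
  nms <- colnames(data)
  sort(nms[startsWith(nms, prefix)])
}

vars <- pick_by_prefix(df, "y")

# A matrix, analogous to cbind(y1, y2, y3)
resp.matrix <- as.matrix(df[, vars, drop = FALSE])

# A character string, analogous to "cbind(y1, y2, y3)"
resp.string <- paste0("cbind(", paste(vars, collapse = ", "), ")")

# A formula, analogous to lhs ~ y1 + y2 + y3
form <- reformulate(vars, response = "lhs")

# A character string, analogous to "lhs ~ y1 + y2 + y3"
form.string <- paste("lhs ~", paste(vars, collapse = " + "))
```

Building the formula with reformulate() rather than by parsing a pasted string is a design choice of this sketch; either route yields a formula object.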
By default, if no variables beginning with the value of prefix are found then NULL is returned.
Setting prefix = " " is a way of selecting no variables.
This function is somewhat experimental at this stage and may change in the near future.
Some of its utility may be better achieved using subset and its select argument, e.g., subset(pdata, TRUE, select = y01:y10).
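As a brief illustration of the subset() alternative, the sketch below uses a made-up data frame called pdata (its contents are an assumption; only the column names y01 to y10 follow the text). Note that select = y01:y10 picks columns by position, from y01 through y10, rather than by name prefix.

```r
# Hypothetical data frame with zero-padded response names y01, ..., y10.
pdata <- as.data.frame(matrix(1:50 %% 2, nrow = 5))
colnames(pdata) <- sprintf("y%02d", 1:10)
pdata$x2 <- c(0.5, 1.0, 1.5, 2.0, 2.5)   # An extra covariate, left out below

# Keep all rows (TRUE) and the contiguous block of columns y01 to y10.
ysub <- subset(pdata, TRUE, select = y01:y10)
ymat <- as.matrix(ysub)                  # A matrix response, if one is needed
```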
For some models such as posbernoulli.t the order of the variables in the xij argument is crucial; therefore care must be taken with the argument sort.arg.
In some instances, it may be good to rename variables y1 to y01, y2 to y02, etc. when there are variables such as y14.
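The reason for such renaming is that character sorting places "y14" before "y2". The sketch below, which uses a hypothetical data frame df and base R only, shows the problem and one possible way of zero-padding the names; it assumes at most two digits follow the prefix.

```r
# Hypothetical data with unpadded names, including a two-digit one.
df <- data.frame(y1 = 0, y2 = 1, y14 = 1, x2 = 0.5)

is.y <- grepl("^y[0-9]+$", colnames(df))   # Columns named "y" plus digits
old <- colnames(df)[is.y]
sorted.old <- sort(old)                    # "y1" "y14" "y2": y14 precedes y2

# Zero-pad the numeric part to two digits: y1 -> y01, y2 -> y02, y14 -> y14.
new <- sprintf("y%02d", as.integer(sub("^y", "", old)))
colnames(df)[is.y] <- new
sorted.new <- sort(colnames(df)[is.y])     # "y01" "y02" "y14"
```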
Currently subsetcol() and Select() are identical.
One of these functions might be withdrawn in the future.
T. W. Yee.
library(VGAM)  # Provides Select(), vglm(), pneumo and Huggins89table1

Pneumo <- pneumo
colnames(Pneumo) <- c("y1", "y2", "y3", "x2")  # The "y" variables are response
Pneumo$x1 <- 1; Pneumo$x3 <- 3; Pneumo$x <- 0; Pneumo$x4 <- 4  # Add these

Select(data = Pneumo)  # Same as with(Pneumo, cbind(y1, y2, y3))
Select(Pneumo, "x")
Select(Pneumo, "x", sort = FALSE, as.char = TRUE)
Select(Pneumo, "x", exclude = "x1")
Select(Pneumo, "x", exclude = "x1", as.char = TRUE)
Select(Pneumo, c("x", "y"))
Select(Pneumo, "z")  # Now returns a NULL
Select(Pneumo, " ")  # Now returns a NULL
Select(Pneumo, prefix = TRUE, as.formula = TRUE)
Select(Pneumo, "x", exclude = c("x3", "x1"), as.formula = TRUE,
       lhs = "cbind(y1, y2, y3)", rhs = "0")
Select(Pneumo, "x", exclude = "x1", as.formula = TRUE, as.char = TRUE,
       lhs = "cbind(y1, y2, y3)", rhs = "0")

# Now a 'real' example:
Huggins89table1 <- transform(Huggins89table1, x3.tij = t01)
tab1 <- subset(Huggins89table1,
               rowSums(Select(Huggins89table1, "y")) > 0)
# Same as
# subset(Huggins89table1,
#        y01 + y02 + y03 + y04 + y05 + y06 + y07 + y08 + y09 + y10 > 0)

# Long way to do it:
fit.th <-
  vglm(cbind(y01, y02, y03, y04, y05, y06, y07, y08, y09, y10) ~ x2 + x3.tij,
       xij = list(x3.tij ~ t01 + t02 + t03 + t04 + t05 + t06 + t07 + t08 +
                           t09 + t10 - 1),
       posbernoulli.t(parallel.t = TRUE ~ x2 + x3.tij),
       data = tab1, trace = TRUE,
       form2 = ~ x2 + x3.tij + t01 + t02 + t03 + t04 + t05 + t06 + t07 + t08 +
                 t09 + t10)

# Short way to do it:
Fit.th <- vglm(Select(tab1, "y") ~ x2 + x3.tij,
               xij = list(Select(tab1, "t", as.formula = TRUE,
                                 sort = FALSE,
                                 lhs = "x3.tij", rhs = "0")),
               posbernoulli.t(parallel.t = TRUE ~ x2 + x3.tij),
               data = tab1, trace = TRUE,
               form2 = Select(tab1, prefix = TRUE, as.formula = TRUE))