Defining the fSet Input Variable
Several of the statistical methods implemented in package DynTxRegime allow for subset modeling or limiting of feasible treatment options. This section details how this input is to be defined.
In general, input fSet is used to define subsets of patients within an analysis. These subsets can be specified to (1) limit available treatments, (2) use different models for the propensity score and/or outcome regressions, and/or (3) use different decision function models for each subset of patients. The combination of inputs moPropen, moMain, moCont, fSet, and/or regimes determines which of these scenarios is being considered. We cover some common situations below.
Regardless of the purpose for specifying fSet, it must be a function that returns a list. There are two options for defining the function. Version 1 is that of the original DynTxRegime package. In this version, fSet defines the rules for determining the subset of treatment options for an INDIVIDUAL. The first element of the returned list is a character, which we term the subset 'nickname.' This nickname is for bookkeeping purposes and is used to link models to subsets. The second element of the returned list is a vector of available treatment options for the subset. The formal arguments of the function must include (i) 'data' or (ii) individual covariate names as given by the column headers of data. An example using the covariate name input form is
fSet <- function(a1) {
  if (a1 > 1) {
    subset <- list('subA', c(1,2))
  } else {
    subset <- list('subB', c(3,4))
  }
  return(subset)
}
This function indicates that if an individual has covariate a1 > 1, they are a member of subset 'subA' and their feasible treatment options are {1,2}. If a1 <= 1, they are a member of subset 'subB' and their feasible treatment options are {3,4}.
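To see what version 1 returns, the function can be called directly on a few covariate values. The snippet below is only an illustrative sketch in base R: the values of a1 are made up, and the lapply() loop is an assumption used to mimic one-call-per-individual evaluation, not a description of how DynTxRegime invokes fSet internally.

```r
# Version 1 definition from the text: evaluated for one individual at a time.
fSet <- function(a1) {
  if (a1 > 1) {
    subset <- list('subA', c(1,2))
  } else {
    subset <- list('subB', c(3,4))
  }
  return(subset)
}

# Hypothetical covariate values for three individuals.
a1 <- c(0.5, 1.0, 2.3)

# Call fSet once per individual (illustration only).
res <- lapply(X = a1, FUN = fSet)

# First element of each result: the subset nickname.
nicknames <- sapply(X = res, FUN = function(x) { x[[1L]] })

# Second element of each result: the feasible treatment options.
txOptions <- lapply(X = res, FUN = function(x) { x[[2L]] })

print(nicknames)
```

Note that the individual with a1 equal to 1 falls in 'subB' because the condition is a strict inequality.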
A more efficient implementation for fSet is now accepted. In the second form, fSet defines the subset of treatment options for the full DATASET. It is again a function with formal arguments (i) 'data' or (ii) individual covariate names as given by the column headers of data. The function returns a list containing two elements: 'subsets' and 'txOpts.' Element 'subsets' is a list comprising all treatment subsets; each element of the list contains the nickname and treatment options for a single subset. Element 'txOpts' is a character vector indicating the subset of which each individual is a member. In this new format, the equivalent definition of fSet as that given above is:
fSet <- function(a1) {
  subsets <- list(list('subA', c(1,2)),
                  list('subB', c(3,4)))
  txOpts <- rep('subB', length(x = a1))
  txOpts[a1 > 1] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}
Though a bit more complicated, this version is much more efficient as it processes the entire dataset at once rather than each individual separately.
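When converting an existing version 1 function to version 2, it can be reassuring to confirm that both assign every individual to the same subset. The following is a minimal sketch of such a check in base R; the names fSetV1 and fSetV2 and the covariate values are hypothetical and chosen only for this illustration.

```r
# Version 1: one individual per call.
fSetV1 <- function(a1) {
  if (a1 > 1) {
    subset <- list('subA', c(1,2))
  } else {
    subset <- list('subB', c(3,4))
  }
  return(subset)
}

# Version 2: the full covariate vector in a single call.
fSetV2 <- function(a1) {
  subsets <- list(list('subA', c(1,2)),
                  list('subB', c(3,4)))
  txOpts <- rep('subB', length(x = a1))
  txOpts[a1 > 1] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}

# Hypothetical covariate values.
a1 <- c(0.5, 1.0, 2.3, 4.0)

# Nicknames from version 1, collected one individual at a time.
v1 <- sapply(X = a1, FUN = function(x) { fSetV1(a1 = x)[[1L]] })

# Nicknames from version 2, obtained in one vectorized call.
v2 <- fSetV2(a1 = a1)$txOpts

# TRUE when the two versions agree for every individual.
sameMembership <- identical(v1, v2)
print(sameMembership)
```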
The simplest scenario involving fSet is to define feasible treatment options and the rules that dictate how those treatment options are determined. For example, responder/non-responder scenarios are often encountered in multiple-decision-point settings. An example of this scenario is: patients who respond to the first stage treatment remain on the original treatment; those who do not respond to the first stage treatment have all treatment options available to them at the second stage. In this case, the propensity score models for the second stage are fit using only 'non-responders,' for whom more than 1 treatment option is available. An example of an appropriate fSet function for the second stage is
fSet <- function(data) {
  if (data$responder == 0L) {
    subset <- list('subA', c(1L,2L))
  } else if (data$tx1 == 1L) {
    subset <- list('subB', c(1L))
  } else if (data$tx1 == 2L) {
    subset <- list('subC', c(2L))
  }
  return(subset)
}
for version 1 or, for version 2,
fSet <- function(data) {
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L)),
                  list('subC', c(2L)))
  txOpts <- character(nrow(x = data))
  txOpts[data$tx1 == 1L] <- 'subB'
  txOpts[data$tx1 == 2L] <- 'subC'
  txOpts[data$responder == 0L] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}
The functions above specify that patients with covariate responder = 0 receive treatments from subset 'subA,' which comprises treatments A = (1,2). Patients with covariate responder = 1 receive treatment from subset 'subB' or 'subC' depending on the first stage treatment received. If fSet is specified in this way, the form of the model object depends on the training data. Specifically, if the training data obey the feasible treatment rule (here, all individuals with responder = 1 received tx in accordance with fSet), moPropen would be a "modelObj"; the propensity model will be fit using only those patients with responder = 0, because those with responder = 1 always receive the appropriate second stage treatment with probability 1.0. However, if the data are from an observational study and the training data do not obey the feasible treatment rules (here, some individuals with responder = 1 received a second stage treatment other than the one dictated by fSet), the responder = 1 data must be modeled and moPropen must be provided as one or more objects of class "ModelObjSubset".
If outcome regression is used by the method, moMain and moCont can be either objects of class "modelObj", if only responder = 0 patients are to be used to obtain parameter estimates, or lists of objects of class "ModelObjSubset", if subsets are to be analyzed individually or combined for a single fit of all data.
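Whether the training data obey the feasible treatment rule can be checked before choosing the form of moPropen. The sketch below uses only base R and the version 2 fSet for the responder scenario; the data frame and the column name tx2 (the observed second stage treatment) are hypothetical, and the check itself is a suggestion rather than something the package is known to require.

```r
# Version 2 fSet for the second stage of the responder scenario.
fSet <- function(data) {
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L)),
                  list('subC', c(2L)))
  txOpts <- character(nrow(x = data))
  txOpts[data$tx1 == 1L] <- 'subB'
  txOpts[data$tx1 == 2L] <- 'subC'
  txOpts[data$responder == 0L] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}

# Hypothetical training data; tx2 is an assumed column name.
data <- data.frame(responder = c(0L, 1L, 1L, 1L),
                   tx1 = c(1L, 1L, 2L, 2L),
                   tx2 = c(2L, 1L, 2L, 1L))

fs <- fSet(data = data)

# Build a lookup from subset nickname to feasible treatment options.
feasible <- lapply(X = fs$subsets, FUN = function(s) { s[[2L]] })
names(feasible) <- sapply(X = fs$subsets, FUN = function(s) { s[[1L]] })

# For each individual, is the observed tx2 among the feasible options?
obeys <- mapply(FUN = function(tx, nickname) { tx %in% feasible[[nickname]] },
                data$tx2, fs$txOpts)

# Rows that violate the rule, if any.
violations <- which(!obeys)
print(violations)
```

In this made-up data set the fourth individual responded to tx1 = 2 but received tx2 = 1, so the data do not obey the rule and the responder = 1 subsets would need to be modeled.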
Consider next a scenario where all patients have the same set of treatment options available, but subsets of patients are to be analyzed using different models. We can define fSet as
fSet <- function(data) {
  if (data$a1 == 1) {
    subset <- list('subA', c(1L,2L))
  } else {
    subset <- list('subB', c(1L,2L))
  }
  return(subset)
}
for version 1 or, in the format of version 2,
fSet <- function(data) {
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L,2L)))
  txOpts <- rep('subB', nrow(x = data))
  txOpts[data$a1 == 1L] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}
where all patients have the same treatment options available, A = (1,2), but different regression models will be fit for each subset (case 2 above) and/or different decision function models will be used for each subset (case 3 above). If different propensity score models are used, moPropen must be a list of objects of class "ModelObjSubset." For example,
propenA <- buildModelObjSubset(model = ~1,
                               solver.method = 'glm',
                               solver.args = list('family'='binomial'),
                               predict.method = 'predict.glm',
                               predict.args = list(type='response'),
                               subset = 'subA')
propenB <- buildModelObjSubset(model = ~1,
                               solver.method = 'glm',
                               solver.args = list('family'='binomial'),
                               predict.method = 'predict.glm',
                               predict.args = list(type='response'),
                               subset = 'subB')
moPropen <- list(propenA, propenB)
If different decision function models are to be fit, regimes would take a form similar to
regimes <- list('subA' = ~x1 + x2,
                'subB' = ~x2)
Notice that the names of the elements of regimes and the subsets passed to buildModelObjSubset() correspond to the names defined by fSet, i.e., 'subA' or 'subB.' These nicknames are used for bookkeeping and link subsets to the appropriate models.
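Because the linkage rests entirely on matching nicknames, a mismatch between fSet and regimes is easy to introduce. The following base R sketch compares the two sets of names before an analysis is run; the data frame is hypothetical, and the variable names definedNames, missingModels, and unusedModels are introduced only for this illustration.

```r
# Version 2 fSet: same treatment options, two analysis subsets.
fSet <- function(data) {
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L,2L)))
  txOpts <- rep('subB', nrow(x = data))
  txOpts[data$a1 == 1L] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}

# Decision function models, keyed by subset nickname.
regimes <- list('subA' = ~x1 + x2,
                'subB' = ~x2)

# Hypothetical data used only to evaluate fSet.
data <- data.frame(a1 = c(1L, 0L, 1L),
                   x1 = c(0.2, 1.4, -0.3),
                   x2 = c(1.0, 0.5, 2.1))

fs <- fSet(data = data)

# Nicknames declared by fSet.
definedNames <- sapply(X = fs$subsets, FUN = function(s) { s[[1L]] })

# Subsets with no decision function model, and models with no subset.
missingModels <- setdiff(x = definedNames, y = names(regimes))
unusedModels <- setdiff(x = names(regimes), y = definedNames)

print(missingModels)
print(unusedModels)
```

Both results are empty here, indicating that every subset defined by fSet has a matching entry in regimes and vice versa.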
For a single-decision-point analysis, fSet is a single function. For multiple-decision-point analyses, fSet is a list of functions where each element of the list corresponds to a decision point (1st element <- 1st decision point, etc.).
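As an illustration of the list form, the sketch below assembles fSet for a two-decision-point analysis. It assumes, purely for the example, that all patients have both treatments available at the first stage and that the responder rule applies at the second stage; the names fSetStage1, fSetStage2, and the nickname 'all' are hypothetical.

```r
# First decision point: every patient may receive treatment 1 or 2
# (an assumption made for this example).
fSetStage1 <- function(data) {
  subsets <- list(list('all', c(1L,2L)))
  txOpts <- rep('all', nrow(x = data))
  return(list("subsets" = subsets, "txOpts" = txOpts))
}

# Second decision point: responders stay on their first stage treatment.
fSetStage2 <- function(data) {
  subsets <- list(list('subA', c(1L,2L)),
                  list('subB', c(1L)),
                  list('subC', c(2L)))
  txOpts <- character(nrow(x = data))
  txOpts[data$tx1 == 1L] <- 'subB'
  txOpts[data$tx1 == 2L] <- 'subC'
  txOpts[data$responder == 0L] <- 'subA'
  return(list("subsets" = subsets, "txOpts" = txOpts))
}

# Element order follows the decision points: stage 1 first, then stage 2.
fSet <- list(fSetStage1, fSetStage2)

# Hypothetical data to evaluate each stage's function.
data <- data.frame(responder = c(0L, 1L, 1L),
                   tx1 = c(2L, 1L, 2L))
stageMembership <- lapply(X = fSet, FUN = function(f) { f(data = data)$txOpts })
print(stageMembership)
```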