Generalized Propensity Score Estimation using GBM
ps.cont calculates generalized propensity scores and corresponding weights using boosted linear regression as implemented in gbm. This function extends ps in twang to continuous treatments. The syntax and output are largely the same. The GBM parameter defaults are those found in Zhu, Coffman, & Ghosh (2015).
Note: ps.cont will be phased out when twang adds functionality for continuous treatments. All functionality and more is already present in weightit with method = "gbm" (see method_gbm).
ps.cont(formula, data, n.trees = 20000, interaction.depth = 4,
        shrinkage = 0.0005, bag.fraction = 1, print.level = 0,
        verbose = FALSE, stop.method, sampw = NULL, optimize = 1,
        use.kernel = FALSE, ...)

## S3 method for class 'ps.cont'
summary(object, ...)

## S3 method for class 'ps.cont'
plot(x, ...)

## S3 method for class 'ps.cont'
boxplot(x, ...)
formula |
A formula for the propensity score model with the treatment indicator on the left side of the formula and the potential confounding variables on the right side. |
data |
The dataset in the form of a data frame, which should include treatment assignment as well as the covariates specified in |
n.trees |
The number of GBM iterations passed on to |
interaction.depth |
The |
shrinkage |
The |
bag.fraction |
The |
print.level |
Currently ignored. |
verbose |
If |
stop.method |
A method or methods of measuring and summarizing balance across pretreatment variables. Current options are |
sampw |
Optional sampling weights. |
optimize |
A numeric value, either |
use.kernel |
Whether to use kernel density estimation as implemented in |
object, x |
A |
... |
For For |
ps.cont extends ps in twang to continuous treatments. It estimates weights from a series of trees and then outputs the weights that optimize a user-set criterion. The criterion employed involves the correlation between the treatment and each covariate. In a fully balanced sample, the treatment will have a correlation of 0 with covariates sufficient for removing confounding. Zhu, Coffman, & Ghosh (2015), who were the first to describe GBM for propensity score weighting with continuous treatments, recommend this procedure and provided R code to implement the methods they describe. ps.cont adapts their syntax to make it consistent with that of ps in twang. As in Zhu et al. (2015), when the Pearson correlation is requested, weighted biserial correlations will be computed for binary covariates.
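The balance criterion described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used inside ps.cont: the helper names weighted_cor and balance_summary are hypothetical, the data are simulated, and the biserial adjustment for binary covariates is omitted.

```r
# Weighted Pearson correlation between two numeric vectors.
# `w` holds non-negative weights; they are normalised to sum to 1.
# (Hypothetical helper, not part of any package.)
weighted_cor <- function(a, b, w) {
  w <- w / sum(w)
  a_c <- a - sum(w * a)  # centre at the weighted mean
  b_c <- b - sum(w * b)
  sum(w * a_c * b_c) / sqrt(sum(w * a_c^2) * sum(w * b_c^2))
}

# Summarise balance as the mean and the maximum absolute
# treatment-covariate correlation across all covariates.
balance_summary <- function(treat, covs, w) {
  r <- vapply(covs, function(x) weighted_cor(treat, x, w), numeric(1))
  c(mean = mean(abs(r)), max = max(abs(r)))
}

# Simulated data: x1 drives the treatment, x2 does not.
set.seed(42)
n <- 200
covs <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
treat <- 0.8 * covs$x1 + rnorm(n)

# With uniform weights this reduces to the ordinary Pearson correlation.
unweighted <- balance_summary(treat, covs, w = rep(1, n))
print(unweighted)
```

Passing a set of candidate weights as w and comparing the resulting summaries is the kind of comparison a stopping rule would make across trees; smaller values indicate better balance.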
The weights are estimated as the marginal density of the treatment divided by the conditional density of the treatment given the covariates for each unit. For the marginal density, a kernel density estimator can be implemented using the density function. For the conditional density, a Gaussian density is assumed. Note that when the treatment has outlying values, extreme weights can be produced, so it is important to examine the weights and trim them if necessary.
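The weight formula above can be illustrated with a minimal base R sketch. To keep it self-contained, an ordinary linear model (lm) stands in for the GBM fit; that substitution, the simulated data, the variable names, and the 99th-percentile trimming rule are all assumptions made for illustration and are not taken from ps.cont.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
treat <- 0.5 * x + rnorm(n)

# Stand-in for the GBM fit: ordinary least squares (an assumption for brevity).
fit <- lm(treat ~ x)
res <- treat - fitted(fit)

# Conditional density of the treatment given the covariate,
# assuming Gaussian residuals.
dens_cond <- dnorm(res, mean = 0, sd = sd(res))

# Marginal density, option 1: Gaussian.
dens_marg_norm <- dnorm(treat, mean = mean(treat), sd = sd(treat))

# Marginal density, option 2: kernel density estimate from density(),
# interpolated at each observed treatment value.
kd <- density(treat, n = 1024)
dens_marg_kern <- approx(kd$x, kd$y, xout = treat)$y

# Weight = marginal density / conditional density.
w_norm <- dens_marg_norm / dens_cond
w_kern <- dens_marg_kern / dens_cond

# Inspect the weights, then cap them at the 99th percentile
# (one illustrative trimming rule among many).
print(summary(w_norm))
cap <- unname(quantile(w_norm, 0.99))
w_trim <- pmin(w_norm, cap)
```

Comparing summary(w_norm) with summary(w_trim) shows how much of the weight distribution the cap affects; a large gap between the maximum and the upper quartile is the usual sign that trimming deserves consideration.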
It is recommended to use as many trees as possible, though this requires more computation time, especially with optimize set to 0. There is little difference between using Pearson and Spearman correlations or between using the raw correlations and the Z-transformed correlations. Typically the only gbm-related options that should be changed are the interaction depth and number of trees.
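A quick numerical check suggests why raw and Z-transformed correlations tend to agree. The sketch below assumes that "Z-transformed" refers to the Fisher z-transformation, atanh(r); the example correlation values are made up.

```r
# Small correlations, typical of a reasonably balanced sample.
r <- c(0.01, 0.05, 0.10, 0.20)

# Fisher z-transformation: 0.5 * log((1 + r) / (1 - r)).
z <- atanh(r)

# The transformed values stay very close to the raw ones when |r| is small.
print(round(cbind(r = r, z = z, diff = z - r), 4))
```

Because the transformation is monotone and nearly the identity near zero, ranking candidate weights by either quantity will usually select similar solutions.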
Missing data is not allowed in the covariates because of the ambiguity in computing correlations with missing values.
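Since missing covariate values are not allowed, it can help to check for them before fitting. The following base R sketch uses a made-up data frame and hypothetical column names; dropping incomplete cases is shown only as one simple option and is not always an appropriate way to handle missing data.

```r
# Toy data with missing covariate values (illustrative only).
dat <- data.frame(
  treat = c(1.2, 0.4, 2.3, 1.9),
  age   = c(25, NA, 40, 31),
  educ  = c(12, 10, 16, NA)
)
covariates <- c("age", "educ")

# Count missing values per covariate.
n_missing <- colSums(is.na(dat[covariates]))
print(n_missing)

# One simple option: keep only rows with complete covariate data.
dat_complete <- dat[complete.cases(dat[covariates]), , drop = FALSE]
print(nrow(dat_complete))
```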
summary.ps.cont compresses the information in the desc component of the ps.cont object into a short summary table describing the size of the dataset and the quality of the generalized propensity score weights, in a similar way to summary.ps.
plot.ps.cont and boxplot.ps.cont function almost identically to plot.ps and boxplot.ps. See the help pages there for more information. Note that of the plot.ps options, only 1, 2, and 6 are available for the plots argument. When optimize = 2, option 1 is not available.
Returns an object of class ps and ps.cont, a list containing
gbm.obj |
The returned |
treat |
The treatment variable. |
desc |
a list containing balance tables for each method selected in
|
ps |
a data frame containing the estimated generalized propensity scores. Each column is associated with one of the methods selected in |
w |
a data frame containing the propensity score weights. Each column is associated with one of the methods selected in |
estimand |
|
datestamp |
Records the date of the analysis. |
parameters |
Saves the |
alerts |
|
iters |
A sequence of iterations used in the GBM fits used by |
balance |
The balance summary for each tree examined, with a column for each stop.method. If |
n.trees |
Maximum number of trees considered in GBM fit. |
data |
Data as specified in the |
The NULL entries exist so the output object is similar to that of ps in twang.
Noah Greifer
ps.cont is heavily adapted from the R code in Zhu, Coffman, & Ghosh (2015). In contrast with their code, ps.cont uses weighted Pearson and Spearman correlations rather than probability weighted bootstrapped correlations, allows for different degrees of optimization in searching for the best solution, and allows for the use of kernel density estimation for the generalized propensity score. ps.cont also takes inspiration from ps in twang.
Zhu, Y., Coffman, D. L., & Ghosh, D. (2015). A Boosting Algorithm for Estimating Generalized Propensity Scores with Continuous Treatments. Journal of Causal Inference, 3(1). doi: 10.1515/jci-2014-0022
weightit and method_gbm for its implementation using weightit syntax.
gbm for the underlying machinery and explanation of the parameters.
# Examples take a long time
## Not run:
library("cobalt")
data("lalonde", package = "cobalt")

# Balancing covariates with respect to re75
psc.out <- ps.cont(re75 ~ age + educ + married + nodegree + race + re74,
                   data = lalonde, stop.method = c("p.mean", "p.max"),
                   optimize = 2)
summary(psc.out)
twang::bal.table(psc.out) # twang's bal.table

## End(Not run)