Covariate Balancing Propensity Score Weighting
This page explains the details of estimating weights from covariate balancing propensity scores by setting method = "cbps" in the call to weightit() or weightitMSM(). This method can be used with binary, multinomial, and continuous treatments.
For binary treatments, this method estimates the propensity scores and weights using CBPS(). The following estimands are allowed: ATE, ATT, and ATC. The weights are taken from the output of the CBPS fit object. When the estimand is the ATE, the returned propensity score is the probability of being in the "second" treatment group, i.e., levels(factor(treat))[2]; when the estimand is the ATC, the returned propensity score is the probability of being in the control (i.e., non-focal) group.
For multinomial treatments with three or four categories and when the estimand is the ATE, this method estimates the propensity scores and weights using one call to CBPS(). For multinomial treatments with more than four categories or when the estimand is the ATT, this method estimates the propensity scores and weights using multiple calls to CBPS(). The following estimands are allowed: ATE and ATT. The weights are taken from the output of the CBPS fit objects.
For continuous treatments, the generalized propensity score and weights are estimated using CBPS().
For longitudinal treatments, the weights are the product of the weights estimated at each time point. This is not how CBMSM() in the CBPS package estimates weights for longitudinal treatments.
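The product rule for longitudinal treatments can be illustrated with a minimal base R sketch. The vectors w_t1 and w_t2 are hypothetical per-time-point weights with made-up values; in practice they would come from the weights estimated at each time point.

```r
# Hypothetical weights for five units at two time points (made-up values)
w_t1 <- c(1.2, 0.8, 1.5, 1.0, 0.9)
w_t2 <- c(0.7, 1.1, 1.3, 1.0, 1.4)

# The longitudinal weight for each unit is the product across time points
w_list <- list(w_t1, w_t2)
w_msm <- Reduce(`*`, w_list)
w_msm
```

Using Reduce() over a list keeps the sketch unchanged when more time points are added.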
Sampling weights are supported through s.weights in all scenarios. See Note about sampling weights.
In the presence of missing data, the following value(s) for missing are allowed:
"ind" (default)
First, for each variable with missingness, a new missingness indicator variable is created which takes the value 1 if the original covariate is NA and 0 otherwise. The missingness indicators are added to the model formula as main effects. The missing values in the covariates are then replaced with 0s (this value is arbitrary and does not affect estimation). The weight estimation then proceeds with this new formula and set of covariates. The covariates output in the resulting weightit object will be the original covariates with the NAs.
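The missingness-indicator approach can be sketched in base R as follows. This is only an illustration of the idea on made-up data, not the code weightit() runs internally; the "_NA" suffix used to name the indicator columns is an assumption chosen for this example.

```r
# Toy covariates with missing values (made-up for illustration)
covs <- data.frame(age     = c(23, NA, 41, 35),
                   educ    = c(12, 16, NA, NA),
                   married = c(1, 0, 1, 0))

covs_ind <- covs
for (v in names(covs)) {
  if (anyNA(covs[[v]])) {
    # Indicator: 1 where the original covariate is NA, 0 otherwise
    # (the "_NA" suffix is a hypothetical naming choice)
    covs_ind[[paste0(v, "_NA")]] <- as.numeric(is.na(covs[[v]]))
    # Replace the missing values with 0 (an arbitrary value)
    covs_ind[[v]][is.na(covs[[v]])] <- 0
  }
}
covs_ind
```

The filled-in covariates and their indicators would then enter the model formula together as main effects, while the original covs, with its NAs, is what the text above says is stored in the output.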
CBPS estimates the coefficients of a logistic regression model (for binary treatments), multinomial logistic regression model (for multinomial treatments), or linear regression model (for continuous treatments) that is used to compute (generalized) propensity scores, from which the weights are computed. It involves augmenting the standard regression score equations with the balance constraints in an over-identified generalized method of moments estimation. The idea is to nudge the estimation of the coefficients toward those that produce balance in the weighted sample. The just-identified version (with over = FALSE) does away with the score equations for the coefficients so that only the balance constraints (and the score equation for the variance of the error with a continuous treatment) are used. The just-identified version will therefore produce better balance on the terms targeted by the balance constraints (the covariate means for binary and multinomial treatments, and the linear terms for continuous treatments) than will the over-identified version.
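To make the just-identified idea concrete, the following base R sketch solves only the balance constraints for a binary treatment and the ATE, using the moment condition that the mean of (T/p - (1 - T)/(1 - p)) * X is zero. It is a rough illustration on simulated data under stated assumptions, not the algorithm implemented in CBPS() or weightit(); the simulated variables, the balance_moments() helper, and the use of optim() are all choices made for this example.

```r
set.seed(123)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.5 * x1 - 0.3 * x2))
X <- cbind(1, x1, x2)  # design matrix with an intercept

# ATE balance moments: column means of (T/p - (1 - T)/(1 - p)) * X
balance_moments <- function(beta) {
  p <- plogis(drop(X %*% beta))
  colMeans((treat / p - (1 - treat) / (1 - p)) * X)
}

# Start from the logistic regression MLE, then drive the moments to zero
start <- coef(glm(treat ~ x1 + x2, family = binomial))
fit <- optim(start, function(b) sum(balance_moments(b)^2),
             method = "BFGS",
             control = list(reltol = 1e-14, maxit = 1000))

# Propensity scores and ATE weights implied by the balancing coefficients
p_hat <- plogis(drop(X %*% fit$par))
w <- ifelse(treat == 1, 1 / p_hat, 1 / (1 - p_hat))

# Weighted mean difference between treatment groups for a covariate
wdiff <- function(x) {
  weighted.mean(x[treat == 1], w[treat == 1]) -
    weighted.mean(x[treat == 0], w[treat == 0])
}
c(x1 = wdiff(x1), x2 = wdiff(x2))
```

Because there are as many balance constraints as coefficients, the constraints can in general be satisfied essentially exactly, so the weighted mean differences should be close to zero. The over-identified version adds the score equations and so cannot satisfy every condition at once.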
Note that WeightIt provides less functionality than does the CBPS package in terms of the versions of CBPS available; for extensions to CBPS, the CBPS package may be preferred.
All arguments to CBPS() can be passed through weightit() or weightitMSM(), with the following exceptions:
method in CBPS() is replaced with the argument over in weightit(). Setting over = FALSE in weightit() is the equivalent of setting method = "exact" in CBPS().
sample.weights is ignored because sampling weights are passed using s.weights.
standardize is ignored.
All arguments take on the defaults of those in CBPS(). It may be useful in many cases to set over = FALSE, especially with continuous treatments.
obj
When include.obj = TRUE, the CB(G)PS model fit. For binary treatments, multinomial treatments with estimand = "ATE" and four or fewer treatment levels, and continuous treatments, the output of the call to CBPS(). For multinomial treatments with estimand = "ATT" or with more than four treatment levels, a list of CBPS fit objects.
When sampling weights are used with CBPS::CBPS(), the estimated weights already incorporate the sampling weights. When weightit() is used with method = "cbps", the estimated weights are separated from the sampling weights, as they are with all other methods.
Binary treatments
Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 243–263.
Multinomial treatments
Imai, K., & Ratkovic, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 243–263.
Continuous treatments
Fong, C., Hazlett, C., & Imai, K. (2018). Covariate balancing propensity score for a continuous treatment: Application to the efficacy of political advertisements. The Annals of Applied Statistics, 12(1), 156–177. doi: 10.1214/17-AOAS1101
CBPS::CBPS() for the fitting function
library("WeightIt")
library("cobalt")
data("lalonde", package = "cobalt")

#Balancing covariates between treatment groups (binary)
(W1 <- weightit(treat ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", estimand = "ATT"))
summary(W1)
bal.tab(W1)

## Not run:
#Balancing covariates with respect to race (multinomial)
(W2 <- weightit(race ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", estimand = "ATE"))
summary(W2)
bal.tab(W2)
## End(Not run)

#Balancing covariates with respect to re75 (continuous)
#using the just-identified version (over = FALSE)
(W3 <- weightit(re75 ~ age + educ + married +
                  nodegree + re74, data = lalonde,
                method = "cbps", over = FALSE))
summary(W3)
bal.tab(W3)