Centers a set of variables around a set of factors
User-level access to the internal demeaning algorithm of fixest.
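To make "centering around factors" concrete, here is a minimal base R sketch. It uses alternating projections: the group means of each factor are removed in turn until the values stop changing. The helper `demean_sketch` is hypothetical and written only for this illustration. Its `iter` and `tol` arguments mirror the names used by demean, but the stopping rule is an assumption. This is not fixest's internal implementation.

```r
# Illustrative sketch only: NOT fixest's internal implementation.
# `demean_sketch` is a hypothetical helper written for this example.
demean_sketch = function(x, factors, iter = 2000, tol = 1e-6) {
  for (i in seq_len(iter)) {
    x_old = x
    # Sweep over the factors, removing the group means of each in turn
    for (f in factors) {
      x = x - ave(x, f)
    }
    # Assumed stopping rule: a full sweep no longer changes the values
    if (max(abs(x - x_old)) < tol) break
  }
  x
}

# Simulated data with two crossed factors
set.seed(1)
n = 200
f1 = sample(letters[1:5], n, replace = TRUE)
f2 = sample(LETTERS[1:4], n, replace = TRUE)
x = rnorm(n)

x_dm = demean_sketch(x, list(f1, f2), tol = 1e-10)

# After convergence, the group means are numerically zero for both factors
max(abs(tapply(x_dm, f1, mean)))
max(abs(tapply(x_dm, f2, mean)))

# The centered variable matches the residuals of a regression on both sets of dummies
x_ols = resid(lm(x ~ factor(f1) + factor(f2)))
max(abs(x_dm - x_ols))
```

With a single factor, one sweep is enough: the result is simply the variable minus its group means.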
demean(
  X,
  f,
  slope.vars,
  slope.flag,
  data,
  weights,
  nthreads = getFixest_nthreads(),
  notes = getFixest_notes(),
  iter = 2000,
  tol = 1e-06,
  na.rm = TRUE,
  as.matrix = is.atomic(X),
  im_confident = FALSE
)
X |
A matrix, vector, data.frame or a list OR a formula. If equal to a formula, then the argument |
f |
A matrix, vector, data.frame or list. The factors used to center the variables in argument X. |
slope.vars |
A vector, matrix or list representing the variables with varying slopes. Matrices will be coerced using |
slope.flag |
An integer vector of the same length as the number of variables in f. |
data |
A data.frame containing all variables in the argument X. |
weights |
Vector, can be missing or NULL. If present, it must contain the same number of observations as in X. |
nthreads |
Number of threads to be used. By default it is equal to getFixest_nthreads(). |
notes |
Logical, whether to display a message when NA values are removed. By default it is equal to getFixest_notes(). |
iter |
Number of iterations, default is 2000. |
tol |
Stopping criterion of the algorithm. Default is 1e-06. |
na.rm |
Logical, default is TRUE. |
as.matrix |
Logical, if |
im_confident |
Logical, default is FALSE. |
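As an illustration of what centering with weights usually means, the base R sketch below subtracts the weighted group mean from each observation, for a single factor. It assumes that the weights argument follows this weighted least squares convention. The data are made up for the example and the code does not call demean.

```r
# Illustrative base R sketch (single factor, made-up data).
# Assumption: weighted centering subtracts the weighted group mean.
x = c(1, 2, 3, 10, 20)
f = c("a", "a", "a", "b", "b")
w = c(1, 1, 2, 1, 3)

# Weighted mean of x within each group, repeated for each observation
w_mean = ave(x * w, f, FUN = sum) / ave(w, f, FUN = sum)
x_dm_w = x - w_mean

# The weighted sums of the centered values are zero within each group
tapply(x_dm_w * w, f, sum)

# Same values as the residuals of a weighted regression on the group dummies
x_wls = resid(lm(x ~ factor(f), weights = w))
all.equal(unname(x_dm_w), unname(x_wls))
```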
It returns a data.frame with one column per variable to be centered. A matrix is returned instead if as.matrix = TRUE, which is the default when X is atomic.
If na.rm = TRUE (the default), observations with NA values (in X, f, slope.vars or weights) are dropped, so the number of rows equals the number of rows in the input minus the number of observations containing NAs. If na.rm = FALSE, the output has the same number of observations as the input (filled with NAs where appropriate).
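The effect of na.rm on the shape of the output can be checked on a small example. This is a minimal sketch on made-up data, assuming fixest is installed. X is passed as a data.frame so that the result is also a data.frame.

```r
library(fixest)

# Made-up toy data: one missing value in the variable to center
df = data.frame(x = c(1, 2, NA, 4, 5, 6),
                g = c("a", "a", "b", "b", "c", "c"))

# na.rm = TRUE: the observation with the NA is dropped from the output
x_drop = demean(X = df["x"], f = df["g"], na.rm = TRUE, notes = FALSE)

# na.rm = FALSE: the output keeps one row per input observation
x_keep = demean(X = df["x"], f = df["g"], na.rm = FALSE, notes = FALSE)

# Compare the number of rows with the 6 rows of the input
NROW(x_drop)
NROW(x_keep)
```

Keeping na.rm = FALSE is convenient when the centered variables are assigned back as new columns of the original data.frame, since the row counts then match.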
You can add variables with varying slopes in the fixed-effect part of the formula. The syntax is: fixef_var[var1, var2]. Here the variables var1 and var2 have varying slopes (one slope per value of fixef_var), and the fixed-effect fixef_var is also included.
To add only the variables with varying slopes and not the fixed-effect, use double square brackets: fixef_var[[var1, var2]].
In other words:
fixef_var[var1, var2] is equivalent to fixef_var + fixef_var[[var1]] + fixef_var[[var2]]
fixef_var[[var1, var2]] is equivalent to fixef_var[[var1]] + fixef_var[[var2]]
In general, for convergence reasons, it is recommended to always add the fixed-effect and avoid using only the variable with varying slope (i.e. use single square brackets).
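The difference between single and double square brackets can be illustrated with ordinary regressions in base R. The sketch below assumes that centering on species[x2] corresponds to taking residuals from a regression with one intercept and one slope per species, and that species[[x2]] corresponds to the slopes only. It uses lm as a stand-in and does not call demean.

```r
# Illustrative base R analogue of the two bracket syntaxes (uses lm, not demean)
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")

# Analogue of species[x2]: one intercept and one slope on x2 per species
r_single = resid(lm(y ~ species + species:x2, data = base))

# Analogue of species[[x2]]: one slope on x2 per species, no species intercept
r_double = resid(lm(y ~ 0 + species:x2, data = base))

# Check for r_single: fit y ~ x2 separately within each species
r_by_group = unsplit(
  lapply(split(base, base$species), function(d) resid(lm(y ~ x2, data = d))),
  base$species
)
all.equal(unname(r_single), unname(r_by_group))

# With the slopes only, the species means are not removed
tapply(r_single, base$species, mean)
tapply(r_double, base$species, mean)
```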
library(fixest)

# Illustration of the FWL theorem
data(trade)
base = trade
base$ln_dist = log(base$dist_km)
base$ln_euros = log(base$Euros)

# We center the two variables ln_dist and ln_euros
# on the factors Origin and Destination
X_demean = demean(X = base[, c("ln_dist", "ln_euros")],
                  f = base[, c("Origin", "Destination")])
base[, c("ln_dist_dm", "ln_euros_dm")] = X_demean

est = feols(ln_euros_dm ~ ln_dist_dm, base)
est_fe = feols(ln_euros ~ ln_dist | Origin + Destination, base)

# The results are the same as if we used the two factors
# as fixed-effects
etable(est, est_fe, se = "st")

#
# Variables with varying slopes
#

# You can center on factors but also on variables with varying slopes
# Let's have an illustration
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")

#
# We center y and x1 on species and x2 * species

# using a formula
base_dm = demean(y + x1 ~ species[x2], data = base)

# using vectors
base_dm_bis = demean(X = base[, c("y", "x1")], f = base$species,
                     slope.vars = base$x2, slope.flag = 1)

# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + species + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)

# only the small sample adj. differs in the SEs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")

#
# center on x2 * species and on another FE

# the 50 values are recycled to fill the 150 rows of base
base$fe = rep(1:5, 10)

# using a formula => double square brackets!
base_dm = demean(y + x1 ~ fe + species[[x2]], data = base)

# using vectors => note slope.flag!
base_dm_bis = demean(X = base[, c("y", "x1")], f = base[, c("fe", "species")],
                     slope.vars = base$x2, slope.flag = c(0, -1))

# Explanations slope.flag = c(0, -1):
# - the first 0: the first factor (fe) is associated to no variable
# - the "-1":
#    * |-1| = 1: the second factor (species) is associated to ONE variable
#    * -1 < 0: the second factor should not be included as such

# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + i(fe) + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)

# only the small sample adj. differs in the SEs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")