Centers a set of variables around a set of factors
User-level access to the internal demeaning algorithm of fixest.
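For a single factor, centering amounts to subtracting the group mean from each observation. The sketch below shows this in base R with made-up data; `x` and `g` are hypothetical illustration values, and the code is not fixest's internal implementation.

```r
# Base R sketch: centering one variable around one factor.
# 'x' and 'g' are hypothetical illustration data, not part of fixest.
x <- c(1, 2, 3, 10, 20, 30)
g <- c("a", "a", "a", "b", "b", "b")

# ave() returns, for each observation, the mean of x within its group
x_centered <- x - ave(x, g)

# Within every group, the centered variable now has mean zero
tapply(x_centered, g, mean)
```

With several factors, subtracting group means once per factor is in general not enough, which is why an iterative algorithm is needed.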
demean( X, f, slope.vars, slope.flag, data, weights, nthreads = getFixest_nthreads(), notes = getFixest_notes(), iter = 2000, tol = 1e-06, na.rm = TRUE, as.matrix = is.atomic(X), im_confident = FALSE )
X |
A matrix, vector, data.frame or a list OR a formula. If equal to a formula, then the argument |
f |
A matrix, vector, data.frame or list. The factors used to center the variables in argument |
slope.vars |
A vector, matrix or list representing the variables with varying slopes. Matrices will be coerced using |
slope.flag |
An integer vector of the same length as the number of variables in |
data |
A data.frame containing all variables in the argument |
weights |
Vector, can be missing or NULL. If present, it must contain the same number of observations as in |
nthreads |
Number of threads to be used. By default it is equal to getFixest_nthreads(). |
notes |
Logical, whether to display a message when NA values are removed. By default it is equal to getFixest_notes(). |
iter |
Number of iterations, default is 2000. |
tol |
Stopping criterion of the algorithm. Default is 1e-06. |
na.rm |
Logical, default is TRUE. |
as.matrix |
Logical, if |
im_confident |
Logical, default is FALSE. |
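To give a feel for what arguments such as iter and tol control, here is a simplified base R sketch of iterative centering on several factors: group means are swept out one factor at a time until a full sweep changes the variable by less than a tolerance. The helper `center_sketch` is hypothetical and is only an illustration of the general idea; it should not be read as the algorithm fixest actually uses.

```r
# Hypothetical helper (not part of fixest): center x on several factors
# by repeatedly subtracting group means until the update is below tol.
center_sketch <- function(x, factors, iter = 2000, tol = 1e-6) {
  for (i in seq_len(iter)) {
    x_old <- x
    for (f in factors) {
      x <- x - ave(x, f)   # sweep out the group means of one factor
    }
    # stop once a full sweep changes x by less than tol
    if (max(abs(x - x_old)) < tol) break
  }
  x
}

# Made-up data: two factors and one variable to center
set.seed(1)
f1 <- sample(1:4, 60, replace = TRUE)
f2 <- sample(1:3, 60, replace = TRUE)
x  <- rnorm(60)

x_dm <- center_sketch(x, list(f1, f2))

# Largest gap to the residuals of a regression on both sets of dummies;
# it should be small, on the order of the tolerance
max(abs(x_dm - resid(lm(x ~ factor(f1) + factor(f2)))))
```

A smaller tol or a larger iter trades computing time for precision in a scheme like this one.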
It returns a data.frame with as many columns as there are variables to be centered.
If na.rm = TRUE (the default in the usage above), the number of rows equals the number of rows in the input minus the number of observations with NA values (in X, f, slope.vars or weights). If na.rm = FALSE, the output has the same number of observations as the input (filled with NAs where appropriate).
A matrix is returned instead if as.matrix = TRUE (by default, as.matrix = is.atomic(X)).
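The following sketch illustrates how na.rm and as.matrix affect the shape of the result. The data frame `df` is made up for the illustration, and the row counts in the comments are what the description above leads one to expect rather than verified output.

```r
library(fixest)

# Hypothetical toy data with one missing value in the variables to center
df <- data.frame(a = c(1, NA, 3, 4, 5, 6),
                 b = c(2, 4, 6, 8, 10, 12),
                 g = c("u", "u", "u", "v", "v", "v"))

# na.rm = TRUE: the observation with an NA is dropped
dm_drop <- demean(X = df[, c("a", "b")], f = df$g,
                  na.rm = TRUE, notes = FALSE)

# na.rm = FALSE: the observation is kept and filled with NA
dm_keep <- demean(X = df[, c("a", "b")], f = df$g,
                  na.rm = FALSE, notes = FALSE)

nrow(dm_drop)  # expected: one row fewer than the input
nrow(dm_keep)  # expected: same number of rows as the input

# Ask for a matrix instead of a data.frame
dm_mat <- demean(X = df[, c("a", "b")], f = df$g,
                 as.matrix = TRUE, notes = FALSE)
is.matrix(dm_mat)
```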
You can add variables with varying slopes in the fixed-effect part of the formula. The syntax is as follows: fixef_var[var1, var2]. Here the variables var1 and var2 will have varying slopes (one slope per value of fixef_var), and the fixed-effect fixef_var will also be added.
To add only the variables with varying slopes and not the fixed-effect, use double square brackets: fixef_var[[var1, var2]].
In other words:
fixef_var[var1, var2] is equivalent to fixef_var + fixef_var[[var1]] + fixef_var[[var2]]
fixef_var[[var1, var2]] is equivalent to fixef_var[[var1]] + fixef_var[[var2]]
In general, for convergence reasons, it is recommended to always add the fixed-effect and avoid using only the variable with varying slope (i.e. use single square brackets).
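The difference between the two bracket forms can be sketched in base R, with lm() residuals standing in for the centering step. This is only an analogy built on the iris data, under the assumption that centering on a fixed-effect with varying slopes removes the same components as the corresponding regression; it does not call fixest.

```r
# Base R sketch on iris (lm() stands in for the centering step)
y  <- iris$Sepal.Length
x2 <- iris$Petal.Length
sp <- iris$Species

# Analogue of species[x2]: one intercept and one slope on x2 per species
y_single <- resid(lm(y ~ 0 + sp + sp:x2))

# Analogue of species[[x2]]: one slope on x2 per species, no intercept
y_double <- resid(lm(y ~ 0 + sp:x2))

# With the fixed-effect included, residuals average to zero in each species
tapply(y_single, sp, mean)

# Without it, they generally do not
tapply(y_double, sp, mean)
```

In both cases the result is orthogonal to x2 within each species; only the single-bracket analogue also removes the species means.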
# Illustration of the FWL theorem
data(trade)
base = trade
base$ln_dist = log(base$dist_km)
base$ln_euros = log(base$Euros)
# We center the two variables ln_dist and ln_euros
# on the factors Origin and Destination
X_demean = demean(X = base[, c("ln_dist", "ln_euros")],
                  f = base[, c("Origin", "Destination")])
base[, c("ln_dist_dm", "ln_euros_dm")] = X_demean
est = feols(ln_euros_dm ~ ln_dist_dm, base)
est_fe = feols(ln_euros ~ ln_dist | Origin + Destination, base)
# The results are the same as if we used the two factors
# as fixed-effects
etable(est, est_fe, se = "st")
#
# Variables with varying slopes
#
# You can center on factors but also on variables with varying slopes
# Let's have an illustration
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")
#
# We center y and x1 on species and x2 * species
# using a formula
base_dm = demean(y + x1 ~ species[x2], data = base)
# using vectors
base_dm_bis = demean(X = base[, c("y", "x1")], f = base$species,
                     slope.vars = base$x2, slope.flag = 1)
# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + species + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)
# only the small-sample adjustment of the SEs differs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")
#
# center on x2 * species and on another FE
base$fe = rep(1:5, 10)
# using a formula => double square brackets!
base_dm = demean(y + x1 ~ fe + species[[x2]], data = base)
# using vectors => note slope.flag!
base_dm_bis = demean(X = base[, c("y", "x1")], f = base[, c("fe", "species")],
                     slope.vars = base$x2, slope.flag = c(0, -1))
# Explanation of slope.flag = c(0, -1):
# - the 0: the first factor (fe) is associated with no variable
# - the -1:
#   * |-1| = 1: the second factor (species) is associated with ONE variable
#   * -1 < 0: the second factor itself should not be included
# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + i(fe) + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)
# only the small-sample adjustment of the SEs differs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")