Option to Switch On/Off Fast Data Transformations


A significant speed-up can be gained by using the fast (panel) data transformation functions from package collapse.


By default, this speed-up is not enabled. The option can be used to enable or disable it. The option is evaluated prior to execution of the supported transformations (see below), so options("" = TRUE) enables the speed-up while options("" = FALSE) disables it.

To have it always switched on, put options("" = TRUE) in your .Rprofile file.
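The switching described above can be sketched with base R's options mechanism. This is a minimal illustration, not the package's own code. The option name "plm.fast" is an assumption, since the text above does not spell the name out. The calls options() and getOption() are base R.

```r
# Assumption: the option is named "plm.fast"; adjust if your version differs.

# options() invisibly returns the previous values, so they can be restored later.
old <- options(plm.fast = TRUE)   # enable the speed-up for this session

# Check the current setting.
isTRUE(getOption("plm.fast"))

# ... call plm() or the supported transformation functions here ...

# Restore whatever was set before (including "unset").
options(old)
```

Saving and restoring the old value keeps the change local to one piece of code. A single line such as the first options() call is what would go into .Rprofile to make the setting permanent.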

See Examples for how to use the option and for a benchmarking example.

By default, package plm uses base R implementations and R-based code. Package collapse provides fast data transformation functions written in C/C++, some of them especially suitable for panel data. Package collapse must be installed for the speed-up to work. However, it is currently not a hard dependency of package plm but only a 'Suggests' dependency.
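Because collapse is only a suggested package, a script may want to enable the speed-up only when collapse is actually available. The following is a hedged sketch of one way to do that. requireNamespace() is base R. The option name "plm.fast" is again an assumption not taken from the text above.

```r
# Check whether package 'collapse' is installed, without attaching it.
has_collapse <- requireNamespace("collapse", quietly = TRUE)

# Assumption: the option is named "plm.fast".
# Enable the fast path only if 'collapse' can be loaded; otherwise keep base R code.
options(plm.fast = has_collapse)

if (!has_collapse) {
  message("Package 'collapse' not installed; using the default (base R) transformations.")
}
```

This pattern is suitable for shared scripts or an .Rprofile that is used on several machines, some of which may not have collapse installed.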

Currently, these functions benefit from the speed-up (more functions are under investigation):

  • between,

  • Between,

  • Sum,

  • Within.
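For orientation, the kind of group-wise computation these four functions perform can be sketched in base R on a toy vector. This is only an illustration under the usual panel-data meaning of the terms (individual means, means or sums repeated per observation, deviations from individual means). It is not the code of plm or collapse, and the variable names are hypothetical.

```r
# Toy panel: three individuals (a, b, c) with two observations each.
x  <- c(1, 3, 2, 6, 10, 20)
id <- c("a", "a", "b", "b", "c", "c")

# 'between'-style: one mean per individual.
between_x <- tapply(x, id, mean)      # a = 2, b = 4, c = 15

# 'Between'-style: individual mean repeated for every observation.
Between_x <- ave(x, id, FUN = mean)   # 2 2 4 4 15 15

# 'Sum'-style: individual sum repeated for every observation.
Sum_x <- ave(x, id, FUN = sum)        # 4 4 8 8 30 30

# 'Within'-style: deviation of each observation from its individual mean.
Within_x <- x - Between_x             # -1 1 -2 2 -5 5
```

Computations of this shape loop over groups in R code, which is why compiled group-wise routines can pay off on large panels.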


## Not run: 
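### A benchmark of plm without and with speed-up
library("plm")
library("collapse")       # provides get_vars(), unlist2d() and the wlddev data
library("microbenchmark")
rm(list = ls())
data("wlddev", package = "collapse")
form <- LIFEEX ~ PCGDP + GINI # assumed model formula; 'form' is not defined in the original example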

# produce big data set (taken from collapse's vignette)
wlddevsmall <- get_vars(wlddev, c("iso3c","year","OECD","PCGDP","LIFEEX","GINI","ODA"))
wlddevsmall$iso3c <- as.character(wlddevsmall$iso3c)
data <- replicate(100, wlddevsmall, simplify = FALSE)
# append the replication number to the individual id so ids stay unique
uniquify <- function(x, i) {
  x$iso3c <- paste0(x$iso3c, i)
  x
}
data <- unlist2d(Map(uniquify, data, as.list(1:100)), idcols = FALSE)
data <- pdata.frame(data, index = c("iso3c", "year"))
pdim(data) # Balanced Panel: n = 21600, T = 59, N = 1274400 // but many NAs
# data <- na.omit(data)
# pdim(data) # Unbalanced Panel: n = 13300, T = 1-31, N = 93900

options("" = FALSE) # default: fast functions of 'collapse' not in use
times <- 3 # no. of repetitions for benchmark - this takes quite long!
bench_res_plm_baseR <- microbenchmark(
  plm(form, data = data, model = "within"),
  plm(form, data = data, model = "within", effect = "twoways"),
  plm(form, data = data, model = "random"),
  plm(form, data = data, model = "random", effect = "twoways"),
 times = times)

options("" = TRUE)
bench_res_plm_collapse <- microbenchmark(
  plm(form, data = data, model = "within"),
  plm(form, data = data, model = "within", effect = "twoways"),
  plm(form, data = data, model = "random"),
  plm(form, data = data, model = "random", effect = "twoways"),
 times = times)
print(bench_res_plm_baseR,    unit = "s")
print(bench_res_plm_collapse, unit = "s")

## End(Not run)


Linear Models for Panel Data

GPL (>= 2)
Yves Croissant [aut, cre], Giovanni Millo [aut], Kevin Tappe [aut], Ott Toomet [ctb], Christian Kleiber [ctb], Achim Zeileis [ctb], Arne Henningsen [ctb], Liviu Andronic [ctb], Nina Schoenfelder [ctb]
Initial release

