Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

sdf_pivot

Pivot a Spark DataFrame


Description

Construct a pivot table over a Spark Dataframe, using a syntax similar to that from reshape2::dcast.

Usage

sdf_pivot(x, formula, fun.aggregate = "count")

Arguments

x

A spark_connection, ml_pipeline, or a tbl_spark.

formula

A two-sided R formula of the form x_1 + x_2 + ... ~ y_1. The left-hand side of the formula indicates which variables are used for grouping, and the right-hand side indicates which variable is used for pivoting. Currently, only a single pivot column is supported.

fun.aggregate

How should the grouped dataset be aggregated? Can be a length-one character vector, giving the name of a Spark aggregation function to be called; a named R list mapping column names to an aggregation method, or an R function that is invoked on the grouped dataset.

Examples

## Not run: 
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")
iris_tbl <- sdf_copy_to(sc, iris, name = "iris_tbl", overwrite = TRUE)

# aggregating by mean
iris_tbl %>%
  mutate(Petal_Width = ifelse(Petal_Width > 1.5, "High", "Low")) %>%
  sdf_pivot(Petal_Width ~ Species,
    fun.aggregate = list(Petal_Length = "mean")
  )

# aggregating all observations in a list
iris_tbl %>%
  mutate(Petal_Width = ifelse(Petal_Width > 1.5, "High", "Low")) %>%
  sdf_pivot(Petal_Width ~ Species,
    fun.aggregate = list(Petal_Length = "collect_list")
  )

## End(Not run)

sparklyr

R Interface to Apache Spark

v1.6.2
Apache License 2.0 | file LICENSE
Authors
Javier Luraschi [aut], Kevin Kuo [aut] (<https://orcid.org/0000-0001-7803-7901>), Kevin Ushey [aut], JJ Allaire [aut], Samuel Macedo [ctb], Hossein Falaki [aut], Lu Wang [aut], Andy Zhang [aut], Yitao Li [aut, cre] (<https://orcid.org/0000-0002-1261-905X>), Jozef Hajnala [ctb], Maciej Szymkiewicz [ctb] (<https://orcid.org/0000-0003-1469-9396>), Wil Davis [ctb], RStudio [cph], The Apache Software Foundation [aut, cph]
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.