Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

spark_read_delta

Read from Delta Lake into a Spark DataFrame.


Description

Read from Delta Lake into a Spark DataFrame.

Usage

spark_read_delta(
  sc,
  path,
  name = NULL,
  version = NULL,
  timestamp = NULL,
  options = list(),
  repartition = 0,
  memory = TRUE,
  overwrite = TRUE,
  ...
)

Arguments

sc

A spark_connection.

path

The path to the file. Needs to be accessible from the cluster. Supports the "hdfs://", "s3a://" and "file://" protocols.

name

The name to assign to the newly generated table.

version

The version of the delta table to read.

timestamp

The timestamp of the delta table to read. For example, "2019-01-01" or "2019-01-01'T'00:00:00.000Z".

options

A list of strings with additional options.

repartition

The number of partitions used to distribute the generated table. Use 0 (the default) to avoid partitioning.

memory

Boolean; should the data be loaded eagerly into memory? (That is, should the table be cached?)

overwrite

Boolean; overwrite the table with the given name if it already exists?

...

Optional arguments; currently unused.

See Also


sparklyr

R Interface to Apache Spark

v1.6.2
Apache License 2.0 | file LICENSE
Authors
Javier Luraschi [aut], Kevin Kuo [aut] (<https://orcid.org/0000-0001-7803-7901>), Kevin Ushey [aut], JJ Allaire [aut], Samuel Macedo [ctb], Hossein Falaki [aut], Lu Wang [aut], Andy Zhang [aut], Yitao Li [aut, cre] (<https://orcid.org/0000-0002-1261-905X>), Jozef Hajnala [ctb], Maciej Szymkiewicz [ctb] (<https://orcid.org/0000-0003-1469-9396>), Wil Davis [ctb], RStudio [cph], The Apache Software Foundation [aut, cph]
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.