Write Spark DataFrame to file using a custom writer
Run a custom R function on Spark workers to write a Spark DataFrame into file(s). If Spark's speculative execution feature is enabled (i.e., 'spark.speculation' is true), then each write task may be executed more than once, and the user-defined writer function must ensure that no concurrent writes happen to the same file path (e.g., by appending a UUID to each file name).
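For instance, a speculation-safe writer could append a UUID to each destination path before writing. The sketch below is illustrative rather than part of the package API; it assumes the 'uuid' package is installed on every worker node.

writer <- function(partition, path) {
  # Append a UUID so that a speculatively re-executed task writes to a
  # distinct file instead of racing another attempt on the same path
  # (assumes the 'uuid' package is available on each worker).
  unique_path <- paste0(path, "-", uuid::UUIDgenerate())
  write.csv(partition, unique_path, row.names = FALSE)
}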
spark_write(x, writer, paths, packages = NULL)
Arguments:

x: A Spark DataFrame to be saved into file(s).

writer: A writer function with the signature function(partition, path), where partition is an R data frame containing all rows from one partition of x, and path specifies the file to which partition should be written.

paths: A single destination path or a list of destination paths, each one specifying a location for a partition from x. If the number of paths differs from the number of partitions of x, then x is re-partitioned so that each path receives exactly one partition (as the first example below does).

packages: Boolean to distribute .libPaths() packages to each worker node, a list of packages to distribute, or a package bundle created with spark_apply_bundle().
Examples:

## Not run: 
library(sparklyr)

sc <- spark_connect(master = "local[3]")

# copy some test data into a Spark DataFrame
sdf <- sdf_copy_to(sc, iris, overwrite = TRUE)

# create a writer function
writer <- function(df, path) {
  write.csv(df, path)
}

spark_write(
  sdf,
  writer,
  # re-partition sdf into 3 partitions and write them to 3 separate files
  paths = list("file:///tmp/file1", "file:///tmp/file2", "file:///tmp/file3")
)

spark_write(
  sdf,
  writer,
  # save all rows into a single file
  paths = list("file:///tmp/all_rows")
)

## End(Not run)
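If the writer depends on packages that are not installed on the worker nodes, they can be shipped via the packages argument. The call below is a minimal sketch, not taken from the package documentation: it reuses the UUID-based writer sketched above and assumes the 'uuid' package can be resolved from the driver's .libPaths().

spark_write(
  sdf,
  writer,
  paths = list("file:///tmp/all_rows"),
  # distribute the 'uuid' package to each worker before writing
  packages = c("uuid")
)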