spark_auto_broadcast_join_threshold

Retrieves or sets the auto broadcast join threshold


Description

Configures the maximum size, in bytes, of a table that will be broadcast to all worker nodes when performing a join. Setting this value to -1 disables broadcasting. Note that statistics are currently supported only for Hive Metastore tables on which the command 'ANALYZE TABLE <tableName> COMPUTE STATISTICS noscan' has been run, and for file-based data source tables, where the statistics are computed directly from the data files.

Usage

spark_auto_broadcast_join_threshold(sc, threshold = NULL)

Arguments

sc

A spark_connection.

threshold

Maximum size, in bytes, of a table that will be broadcast to all worker nodes when performing a join. Defaults to NULL, in which case the current setting is retrieved rather than changed.
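
Examples

The following is a minimal sketch of retrieving, setting, and disabling the threshold, based on the usage shown above. It assumes a local Spark installation is available and connects with master = "local"; the 10 MB value is only an illustration. Passing the threshold as an integer is a precaution chosen here, not a documented requirement.

```r
library(sparklyr)

# Assumption: Spark is installed locally (for example via spark_install()).
sc <- spark_connect(master = "local")

# Retrieve the current threshold: threshold is left at its default, NULL.
spark_auto_broadcast_join_threshold(sc)

# Set the threshold to 10 MB, expressed in bytes (illustrative value).
ten_mb <- 10L * 1024L * 1024L
spark_auto_broadcast_join_threshold(sc, threshold = ten_mb)

# Disable automatic broadcast joins by setting the threshold to -1.
spark_auto_broadcast_join_threshold(sc, threshold = -1L)

spark_disconnect(sc)
```

Calling the function again with only sc after each change is a simple way to confirm which value the session is currently using.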

sparklyr

R Interface to Apache Spark

v1.6.2
Apache License 2.0 | file LICENSE
Authors
Javier Luraschi [aut], Kevin Kuo [aut] (<https://orcid.org/0000-0001-7803-7901>), Kevin Ushey [aut], JJ Allaire [aut], Samuel Macedo [ctb], Hossein Falaki [aut], Lu Wang [aut], Andy Zhang [aut], Yitao Li [aut, cre] (<https://orcid.org/0000-0002-1261-905X>), Jozef Hajnala [ctb], Maciej Szymkiewicz [ctb] (<https://orcid.org/0000-0003-1469-9396>), Wil Davis [ctb], RStudio [cph], The Apache Software Foundation [aut, cph]
Initial release