Apply a function to a stream of RecordBatches
As an alternative to calling collect() on a Dataset query, you can use this function to access the stream of RecordBatches in the Dataset. This lets you aggregate on each chunk and pull the intermediate results into a data.frame for further aggregation, even if you couldn't fit the whole Dataset result in memory.
map_batches(X, FUN, ..., .data.frame = TRUE)
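X | A Dataset or arrow_dplyr_query object, as returned by the dplyr methods on Dataset.
FUN | A function or purrr-style lambda expression to apply to each batch.
... | Additional arguments passed to FUN.
.data.frame | logical: collect the resulting chunks into a single data.frame? Default TRUE.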
This is experimental and not recommended for production use.
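The chunk-wise aggregation described above can be sketched as follows. This is a minimal illustration, not an official example: the temporary dataset built from mtcars, the partitioning by cyl, and the names partial_sums, partials and overall_mean are all assumptions made for the sketch. It also assumes that as.data.frame() works on whatever map_batches() returns. With the signature shown here the result should already be a data.frame, while newer arrow releases may return a RecordBatchReader instead.

```r
library(arrow)

# Hypothetical example data: write mtcars as a small partitioned dataset
# so that the scan yields more than one RecordBatch.
path <- tempfile()
write_dataset(mtcars, path, partitioning = "cyl")
ds <- open_dataset(path)

# FUN receives one RecordBatch at a time and returns a one-row partial
# aggregate, so only a single batch has to be materialised at once.
partial_sums <- function(batch) {
  df <- as.data.frame(batch)
  data.frame(n = nrow(df), total_mpg = sum(df$mpg))
}

partials <- map_batches(ds, partial_sums)

# Assumption: coercing is a no-op when a data.frame came back, and reads
# the stream when a RecordBatchReader came back (newer arrow versions).
partials <- as.data.frame(partials)

# Combine the per-batch partial results into the final aggregate.
overall_mean <- sum(partials$total_mpg) / sum(partials$n)
```

Returning counts and sums rather than per-batch means keeps the final step exact: a mean of batch means would weight batches of different sizes equally.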