Reduce Operation for Batch Systems
A parallel and asynchronous Reduce for batch systems.
Note that this function only defines the computational jobs. Each job reduces a certain number of elements on one slave. The actual computation is started with submitJobs. Results and partial results can be collected with reduceResultsList, reduceResults or loadResult.
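As a rough illustration of what the jobs compute, the following base R sketch mimics the chunked reduction locally, without batchtools or a batch system. It assumes that each job applies base R's Reduce to its own chunk, starting from init. The helper reduce_locally and the manual chunk assignment are hypothetical and not part of the package.

```r
# Hypothetical local stand-in for the chunked reduction; no batch system involved.
reduce_locally <- function(fun, xs, init, chunks) {
  # Each group of elements plays the role of one job.
  groups <- split(xs, chunks)
  # Assumption: every job starts from `init` and folds its elements with `fun`.
  lapply(groups, function(elements) Reduce(fun, elements, init))
}

xs <- 1:100
f <- function(aggr, x) aggr + x

# Assign consecutive blocks of 5 elements to the same chunk (20 chunks).
chunks <- ceiling(seq_along(xs) / 5)

# One partial result per chunk, comparable to one result per job.
partials <- reduce_locally(f, xs, init = 0, chunks = chunks)

# Final reduction over the partial results, as it would happen on the master.
total <- Reduce(f, partials, 0)
```

Under this assumption, init enters every chunk once, so the sketch only reproduces a plain sequential reduction when init is neutral for fun (such as 0 for addition).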
batchReduce(fun, xs, init = NULL, chunks = seq_along(xs), more.args = list(), reg = getDefaultRegistry())
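fun | [function(aggr, x, ...)] Function to reduce xs with.
xs | [vector] Vector to reduce.
init | [ANY] Initial object for reducing.
chunks | [integer(length(xs))] Group for each element of xs. Can be generated with chunk.
more.args | [list] A list of additional arguments passed to fun.
reg | [Registry] Registry to add the jobs to. If not passed explicitly, the default registry returned by getDefaultRegistry() is used.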
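To make the roles of chunks and more.args more concrete, here is a small base R sketch that runs locally. It assumes that the entries of more.args are appended to every call of fun and that elements sharing a chunk number end up in the same job. The names weighted_add, w, and reduce_chunk are hypothetical and used for illustration only.

```r
xs <- c(2, 4, 6, 8)

# Elements 1-2 form chunk 1, elements 3-4 form chunk 2.
chunks <- c(1L, 1L, 2L, 2L)

# Hypothetical reduce function with an extra argument `w`.
weighted_add <- function(aggr, x, w) aggr + w * x
more.args <- list(w = 10)
init <- 0

# Assumption: within one job, `more.args` is appended to each call of the function.
reduce_chunk <- function(elements) {
  step <- function(aggr, x) do.call(weighted_add, c(list(aggr, x), more.args))
  Reduce(step, elements, init)
}

# One partial result per chunk.
partials <- lapply(split(xs, chunks), reduce_chunk)

# With the default chunks = seq_along(xs), every element forms its own group.
default_groups <- split(xs, seq_along(xs))
```

In this sketch the default grouping yields as many groups as there are elements, which is why passing an explicit chunks vector is the way to bundle several elements into a single job.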
[data.table] with ids of added jobs stored in column “job.id”.
# define function to reduce on slave, we want to sum a vector
tmp = makeRegistry(file.dir = NA, make.default = FALSE)
xs = 1:100
f = function(aggr, x) aggr + x

# sum 5 numbers on each slave process, i.e. 20 jobs
chunks = chunk(xs, chunk.size = 5)
batchReduce(fun = f, xs, init = 0, chunks = chunks, reg = tmp)
submitJobs(reg = tmp)
waitForJobs(reg = tmp)

# now reduce one final time on master
reduceResults(fun = function(aggr, job, res) f(aggr, res), reg = tmp)