Enable parallelization on batch systems
This class is used to parameterize scheduler options on managed high-performance computing clusters using batchtools.
BatchtoolsParam()
: Construct a BatchtoolsParam-class object.
batchtoolsWorkers()
: Return the default number of workers for each backend.
batchtoolsTemplate()
: Return the default template for each backend.
batchtoolsCluster()
: Return the default cluster.
batchtoolsRegistryargs()
: Create a list of arguments to be used in batchtools' makeRegistry; see the registryargs argument.
BatchtoolsParam(
    workers = batchtoolsWorkers(cluster),
    cluster = batchtoolsCluster(),
    registryargs = batchtoolsRegistryargs(),
    saveregistry = FALSE,
    resources = list(),
    template = batchtoolsTemplate(cluster),
    stop.on.error = TRUE,
    progressbar = FALSE,
    RNGseed = NA_integer_,
    timeout = 30L * 24L * 60L * 60L,
    exportglobals = TRUE,
    log = FALSE,
    logdir = NA_character_,
    resultdir = NA_character_,
    jobname = "BPJOB"
)

batchtoolsWorkers(cluster = batchtoolsCluster())

batchtoolsCluster(cluster)

batchtoolsTemplate(cluster)

batchtoolsRegistryargs(...)
workers | Number of workers to divide tasks (e.g., elements in the first argument of bplapply) between. On 'multicore' and 'socket' backends, this defaults to multicoreWorkers() and snowWorkers(), respectively. On managed clusters (e.g., slurm, SGE), workers has no default, so the number of workers must be provided by the user.
cluster | Cluster type used as the backend by BatchtoolsParam. The available options are "socket", "multicore", "interactive", "sge", "slurm", "lsf", "torque" and "openlava". The cluster type, if available on the machine, is registered as the backend. Cluster types that need a template are "sge", "slurm", "lsf", "openlava", and "torque". If no template is given, a default is selected from the batchtools package.
registryargs | Arguments passed to the registry created by BatchtoolsParam, configuring the registry and where it is stored. The registryargs can be specified with the function batchtoolsRegistryargs(), which takes the arguments file.dir, work.dir, packages, namespaces, source, load, and make.default. It is useful to configure these options, especially to set file.dir to a location that is accessible to all nodes on your job scheduler, i.e., the master and the workers. By default, file.dir creates the registry in your working directory.
saveregistry | Option to store the entire registry for the job(s). This functionality should only be used when debugging, because storing the entire registry can be expensive in both time and disk space. The registry is saved in the directory specified by file.dir in registryargs; the default location is the current working directory. The saved registry directories have the suffixes "-1", "-2" and so on, one for each time the BatchtoolsParam is used.
resources | Arguments passed to the resources argument of batchtools::submitJobs during evaluation of bplapply and similar functions. These name-value pairs are used for substitution into the template file.
template | Path to a template for the backend in BatchtoolsParam. It is possible to check which template is being used by the object using the getter bpbackend(BatchtoolsParam()). The template needs to be written specifically for each backend. Please check the list of available templates in the batchtools package.
stop.on.error | Stop all jobs as soon as one job fails (stop.on.error == TRUE) or wait for all jobs to terminate. Default is TRUE.
progressbar | Show a progress bar during evaluation? The default, FALSE, suppresses the progress bar used in BatchtoolsParam and is less verbose.
RNGseed | Set an initial seed for the RNG. Default is NA_integer_, in which case a random seed is chosen upon initialization.
timeout | Time (in seconds) allowed for a worker to complete a task. The default is 30 days. If the computation exceeds timeout, an error is thrown with the message 'reached elapsed time limit'.
exportglobals | Export base::options() from manager to workers? Default TRUE.
log | Option to save the logs produced by the jobs. If log=TRUE, the logdir option must be specified.
logdir | Path to the location where logs are stored. The logdir option takes effect only when log=TRUE.
resultdir | Path where results are stored.
jobname | Job name that is prepended to the output log and result files. Default is "BPJOB".
... | Name-value pairs. Names and values correspond to arguments from batchtools makeRegistry.
Return an object with specified values. The object may be saved to disk or reused within a session.
bplapply handles arguments X of classes derived from S4Vectors::List specially, coercing to list.
Nitesh Turaga, mailto:nitesh.turaga@roswellpark.org
getClass("BiocParallelParam") for additional parameter classes.
register for registering parameter classes for use in parallel evaluation.
The batchtools package.
## Pi approximation
piApprox = function(n) {
    nums = matrix(runif(2 * n), ncol = 2)
    d = sqrt(nums[, 1]^2 + nums[, 2]^2)
    4 * mean(d <= 1)
}

piApprox(1000)

## Calculate piApprox 10 times
param <- BatchtoolsParam()
result <- bplapply(rep(10e5, 10), piApprox, BPPARAM=param)

## Not run: 
## see vignette for additional explanation
library(BiocParallel)
param = BatchtoolsParam(workers=5,
                        cluster="sge",
                        template="script/test-sge-template.tmpl")

## Run parallel job
result = bplapply(rep(10e5, 100), piApprox, BPPARAM=param)

## bpmapply
## 'fun' is not defined in the original example; this placeholder is an assumption
fun = function(x, y, z) x + y + z
param = BatchtoolsParam()
result = bpmapply(fun, x = 1:3, y = 1:3, MoreArgs = list(z = 1),
                  SIMPLIFY = TRUE, BPPARAM = param)

## bpvec
param = BatchtoolsParam(workers=2)
result = bpvec(1:10, seq_along, BPPARAM=param)

## bpvectorize
param = BatchtoolsParam(workers=2)
## this returns a function
bpseq_along = bpvectorize(seq_along, BPPARAM=param)
result = bpseq_along(1:10)

## bpiterate
ITER <- function(n=5) {
    i <- 0L
    function() {
        i <<- i + 1L
        if (i > n)
            return(NULL)
        rep(i, n)
    }
}

param <- BatchtoolsParam()
res <- bpiterate(ITER=ITER(), FUN=function(x,y) sum(x) + y, y=10, BPPARAM=param)

## save logs
logdir <- tempfile()
dir.create(logdir)
param <- BatchtoolsParam(log=TRUE, logdir=logdir)
res <- bplapply(rep(10e5, 10), piApprox, BPPARAM=param)

## save registry (should be used only for debugging)
file.dir <- tempfile()
registryargs <- batchtoolsRegistryargs(file.dir = file.dir)
param <- BatchtoolsParam(saveregistry = TRUE, registryargs = registryargs)
res <- bplapply(rep(10e5, 10), piApprox, BPPARAM=param)
dir(dirname(file.dir), basename(file.dir))

## End(Not run)
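The resources argument is described above as a set of name-value pairs substituted into the template file, but none of the examples use it. The following is a minimal sketch of how it might be passed on a slurm cluster. The resource names (walltime, memory, ncpus), their units and their values are assumptions that must match the placeholders of the template actually in use, and the have_slurm guard is a hypothetical helper added so the sketch does nothing on machines without a scheduler.

```r
## Illustrative values only. The names (walltime, memory, ncpus) are assumed
## to match the placeholders of the template in use; adjust to your template.
resources <- list(
    walltime = 3600,  # assumed unit: seconds
    memory = 1024,    # assumed unit: megabytes
    ncpus = 1
)

## Hypothetical guard: submit only when both packages and the SLURM
## 'sbatch' command are available on this machine.
have_slurm <- requireNamespace("BiocParallel", quietly = TRUE) &&
    requireNamespace("batchtools", quietly = TRUE) &&
    nzchar(Sys.which("sbatch"))

if (have_slurm) {
    ## workers has no default on managed clusters, so it is given explicitly
    param <- BiocParallel::BatchtoolsParam(
        workers = 4, cluster = "slurm", resources = resources
    )
    result <- BiocParallel::bplapply(1:4, sqrt, BPPARAM = param)
}
```

Because no template is supplied here, the default template for the cluster type is used; a site-specific template may expect different resource names.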