
MALDIquant-parallel

Parallel Support in Package MALDIquant


Description

MALDIquant offers multi-core support using mclapply and mcmapply from the parallel package. Because these functions rely on forking, this approach is limited to Unix-based platforms.
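The platform restriction can be handled up front by choosing a core count that is safe to pass on as mc.cores. The following is a minimal sketch that uses only the parallel package; safeCores is a hypothetical helper introduced for illustration and is not part of MALDIquant.

```r
library("parallel")

## Hypothetical helper (not part of MALDIquant): choose a core count that is
## safe to pass as mc.cores on the current platform.
safeCores <- function(requested=2L) {
  if (.Platform$OS.type != "unix") {
    ## forking is unavailable, so fall back to serial execution
    return(1L)
  }
  available <- detectCores()
  if (is.na(available)) {
    ## detectCores() returns NA if the core count cannot be determined
    return(1L)
  }
  max(1L, min(as.integer(requested), available))
}

cores <- safeCores(2L)

## mclapply runs serially when mc.cores is 1
squares <- mclapply(1:4, function(i)i^2, mc.cores=cores)
```

The resulting value could then be handed to any function that accepts mc.cores, so the same script runs on Unix-based and other platforms.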

Please note that not all functions benefit from parallelisation. Often the overhead of creating and copying objects outweighs the time saved by running in parallel. This is especially true for functions that are very fast to compute (e.g. the sqrt transformation). That is why the default value of the mc.cores argument in all functions is 1L. Which steps benefit from parallelisation depends on the size of the dataset (often only removeBaseline and detectPeaks do).
In general it is faster to encapsulate the complete workflow in a function and parallelise that function using mclapply than to use the mc.cores argument of each method. The reason is the reduced overhead for object management: only one split/combine is needed, instead of repeating these operations in every function.
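The overhead argument can be tried out without any spectra. The following sketch uses only base R and the parallel package. The toy data and the helpers perStep and oneSplit are hypothetical stand-ins for a real workflow, and the timings depend on the machine, so no expected numbers are given.

```r
library("parallel")

## toy data: 100 short numeric vectors standing in for intensity values
set.seed(1)
toy <- replicate(100, runif(1000), simplify=FALSE)

## forking needs a unix-like platform; otherwise stay serial
cores <- if (.Platform$OS.type == "unix") 2L else 1L

## a very cheap step: the fork/copy overhead may exceed the compute time
print(system.time(s1 <- lapply(toy, sqrt)))
print(system.time(s2 <- mclapply(toy, sqrt, mc.cores=cores)))

## both routes must give the same result
stopifnot(isTRUE(all.equal(s1, s2)))

## parallelising each step separately splits/combines three times ...
perStep <- function(x, cores) {
  x <- mclapply(x, sqrt, mc.cores=cores)
  x <- mclapply(x, cumsum, mc.cores=cores)
  mclapply(x, range, mc.cores=cores)
}

## ... whereas wrapping all steps in one function splits/combines only once
oneSplit <- function(x, cores) {
  mclapply(x, function(v)range(cumsum(sqrt(v))), mc.cores=cores)
}

print(system.time(r1 <- perStep(toy, cores)))
print(system.time(r2 <- oneSplit(toy, cores)))
stopifnot(isTRUE(all.equal(r1, r2)))
```

Both variants return the same values; only the number of split/combine rounds differs, which is the effect described above.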

Details

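The mc.cores argument is supported by, among others, the functions/methods used in the examples below: transformIntensity, smoothIntensity, removeBaseline, calibrateIntensity and detectPeaks.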

See Also

Examples

## load package
library("MALDIquant")

## load example data
data("fiedler2009subset", package="MALDIquant")

## run single-core baseline correction
print(system.time(
  b1 <- removeBaseline(fiedler2009subset, method="SNIP")
))

if(.Platform$OS.type == "unix") {
  ## run multi-core baseline correction
  print(system.time(
    b2 <- removeBaseline(fiedler2009subset, method="SNIP", mc.cores=2)
  ))
  ## all.equal returns a character vector on mismatch, so wrap it in isTRUE
  stopifnot(isTRUE(all.equal(b1, b2)))
}

## parallelise complete workflow
workflow <- function(spectra, cores) {
  s <- transformIntensity(spectra, method="sqrt", mc.cores=cores)
  s <- smoothIntensity(s, method="SavitzkyGolay", halfWindowSize=10,
                       mc.cores=cores)
  s <- removeBaseline(s, method="SNIP", iterations=100, mc.cores=cores)
  s <- calibrateIntensity(s, method="TIC", mc.cores=cores)
  detectPeaks(s, method="MAD", halfWindowSize=20, SNR=2, mc.cores=cores)
}

if(.Platform$OS.type == "unix") {
  ## parallelising the complete workflow is often faster because the overhead
  ## is reduced
  print(system.time(
    p1 <- unlist(parallel::mclapply(fiedler2009subset,
                                    function(x)workflow(list(x), cores=1),
                                    mc.cores=2), use.names=FALSE)
  ))
  print(system.time(
    p2 <- workflow(fiedler2009subset, cores=2)
  ))
  ## both approaches must yield identical peak lists
  stopifnot(isTRUE(all.equal(p1, p2)))
}

MALDIquant

Quantitative Analysis of Mass Spectrometry Data

Version: 1.19.3
License: GPL (>= 3)
Authors: Sebastian Gibb [aut, cre] (<https://orcid.org/0000-0001-7406-4443>), Korbinian Strimmer [ths] (<https://orcid.org/0000-0001-7917-2056>)
Initial release: 2019-05-12
