Hierarchical consensus calculation
Hierarchical consensus calculation with optional data calibration.
hierarchicalConsensusCalculation(
   individualData,
   consensusTree,
   level = 1,
   useBlocks = NULL,
   randomSeed = NULL,
   saveCalibratedIndividualData = FALSE,
   calibratedIndividualDataFilePattern =
         "calibratedIndividualData-%a-Set%s-Block%b.RData",

   # Return options: the data can be either saved or returned but not both.
   saveConsensusData = TRUE,
   consensusDataFileNames = "consensusData-%a-Block%b.RData",
   getCalibrationSamples = FALSE,

   # Return the intermediate results as well?
   keepIntermediateResults = FALSE,

   # Internal handling of data
   useDiskCache = NULL,
   chunkSize = NULL,
   cacheDir = ".",
   cacheBase = ".blockConsModsCache",

   # Behaviour
   collectGarbage = FALSE,
   verbose = 1,
   indent = 0)
individualData |
Individual data from which the consensus is to be calculated. It can be either a list or a
|
consensusTree |
A list specifying the consensus calculation. See details. |
level |
Integer which the user should leave at 1. This serves to keep default set names unique. |
useBlocks |
When |
randomSeed |
If non- |
saveCalibratedIndividualData |
Logical: should calibrated individual data be saved? |
calibratedIndividualDataFilePattern |
Pattern from which file names for saving calibrated individual data are determined. The conversions
|
saveConsensusData |
Logical: should final consensus be saved ( |
consensusDataFileNames |
Pattern from which file names for saving the final consensus are determined. The conversions
|
getCalibrationSamples |
When calibration method in the |
keepIntermediateResults |
Logical: should results of intermediate consensus calculations (if any) be kept? These are always returned
as |
useDiskCache |
Logical: should disk cache be used for consensus calculations? The disk cache can be used to store chunks of
calibrated data that are small enough to fit one chunk from each set into memory (blocks may be small enough
to fit one block of one set into memory, but not small enough to fit one block from all sets in a consensus
calculation into memory at the same time). Using disk cache is slower but lessens the memory footprint of
the calculation.
As a general guide, if individual data are split into blocks, we
recommend setting this argument to |
chunkSize |
Integer giving the chunk size. If left |
cacheDir |
Directory in which to save cache files. The files are deleted on normal exit but persist if the function terminates abnormally. |
cacheBase |
Base for the file names of cache files. |
collectGarbage |
Logical: should garbage collection be forced after each major calculation? |
verbose |
Integer level of verbosity of diagnostic messages. Zero means silent, higher values make the output progressively more and more verbose. |
indent |
Indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces. |
This function calculates consensus in a hierarchical manner, using a separate (and possibly different) set of
consensus options at each step. The "recipe" for the consensus calculation is supplied in the argument
consensusTree.
The argument consensusTree should have the following components:
(1) inputs must be either a character vector whose components match names(individualData), or a list whose
elements are such names or consensus trees in their own right.
(2) consensusOptions must be a list of class "ConsensusOptions" that specifies options for calculating the
consensus. A suitable set of options can be obtained by calling newConsensusOptions.
(3) Optionally, the component analysisName can be a single character string giving the name for the analysis.
When intermediate results are returned, they are returned in a list whose names are set from the analysisName
components, if they exist.
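To make the nesting concrete, the following minimal sketch builds a two-level tree as plain nested lists and walks it recursively. It uses base R only. The `placeholderOptions` object and the helper `leafInputs` are hypothetical and exist only for this illustration; in real use the options would come from newConsensusOptions.

```r
# Hypothetical stand-in for a "ConsensusOptions" object; in real use this
# would be the value returned by newConsensusOptions().
placeholderOptions <- structure(list(), class = "ConsensusOptions")

# Inner node: consensus of Set2 and Set3 (inputs given as a character vector).
innerTree <- list(
  inputs = c("Set2", "Set3"),
  consensusOptions = placeholderOptions,
  analysisName = "Consensus of sets 2 and 3")

# Top node: its inputs mix a set name with a nested consensus tree.
topTree <- list(
  inputs = list("Set1", innerTree),
  consensusOptions = placeholderOptions,
  analysisName = "Final consensus")

# Hypothetical helper: collect the names of all individual data sets
# referenced anywhere in a tree, descending into nested trees.
leafInputs <- function(tree) {
  unlist(lapply(tree$inputs, function(input) {
    if (is.character(input)) input else leafInputs(input)
  }))
}

leafInputs(topTree)
# [1] "Set1" "Set2" "Set3"
```

Every name returned by such a walk is expected to match a name of the individual data supplied to the function.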
The actual consensus calculation at each level of the consensus tree is carried out in function
consensusCalculation. The consensus options for each individual consensus calculation are independent of
one another, i.e., the consensus options for different steps can be different.
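The recursion can be illustrated with a simplified sketch in base R. This is not the WGCNA implementation. It assumes, for illustration only, that each node reduces its inputs by an element-wise quantile and that the node stores that quantile in a hypothetical `quantile` field instead of a "ConsensusOptions" object. Calibration, blocks and disk caching are left out. The function name `evalConsensus` is also hypothetical.

```r
# Hypothetical sketch: evaluate a simplified consensus tree recursively.
evalConsensus <- function(tree, data) {
  # Resolve every input: a character name is looked up in `data`,
  # a nested tree is evaluated first and its result used as the input.
  inputs <- lapply(tree$inputs, function(input) {
    if (is.character(input)) data[[input]] else evalConsensus(input, data)
  })
  # Stack the input matrices into a 3-dimensional array and take the
  # node's own quantile across inputs, element by element.
  stacked <- simplify2array(inputs)
  apply(stacked, c(1, 2), quantile, probs = tree$quantile, names = FALSE)
}

set.seed(5)
data <- replicate(3, matrix(rnorm(4 * 5), 4, 5), simplify = FALSE)
names(data) <- c("Set1", "Set2", "Set3")

# Each step uses its own setting: 0.25 for the inner node, 0 (the minimum)
# for the top node.
inner <- list(inputs = c("Set2", "Set3"), quantile = 0.25)
top <- list(inputs = list("Set1", inner), quantile = 0)

consensus <- evalConsensus(top, data)
dim(consensus)
# [1] 4 5
```

Because the top node uses quantile 0 in this sketch, every entry of the result is no larger than the corresponding entry of Set1.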
A list containing the output of the top level call to consensusCalculation; if keepIntermediateResults is
TRUE, component inputs contains a (possibly recursive) list of the results of intermediate consensus
calculations. Names of the inputs list are taken from the corresponding analysisName components if they
exist, otherwise from names of the corresponding inputs components of the supplied consensusTree. See the
example below for a relatively simple consensus tree.
Peter Langfelder
newConsensusOptions for obtaining a suitable list of consensus options;
consensusCalculation for the actual calculation of a consensus that underpins this function.
# We generate 3 simple matrices. simplify = FALSE makes replicate() return a
# list rather than a 3-dimensional array.
set.seed(5)
data = replicate(3, matrix(rnorm(10*100), 10, 100), simplify = FALSE)
names(data) = c("Set1", "Set2", "Set3");

# Put together a consensus tree. In this example the final consensus uses
# as input set 1 and a consensus of sets 2 and 3.

# First define the consensus of sets 2 and 3:
consTree.23 = newConsensusTree(
   inputs = c("Set2", "Set3"),
   consensusOptions = newConsensusOptions(calibration = "none",
                                          consensusQuantile = 0.25),
   analysisName = "Consensus of sets 2 and 3");

# Now define the final consensus
consTree.final = newConsensusTree(
   inputs = list("Set1", consTree.23),
   consensusOptions = newConsensusOptions(calibration = "full quantile",
                                          consensusQuantile = 0),
   analysisName = "Final consensus");

# Calculate the consensus; results are returned rather than saved to disk.
consensus = hierarchicalConsensusCalculation(
   individualData = data,
   consensusTree = consTree.final,
   saveConsensusData = FALSE,
   keepIntermediateResults = FALSE)

names(consensus)