Iterative filtering of samples and genes with too many missing entries
This function checks data for missing entries, entries with weights below a threshold, and zero-variance genes, and returns a list of samples and genes that pass criteria on maximum number of missing or low weight values. If necessary, the filtering is iterated.
goodSamplesGenes( datExpr, weights = NULL, minFraction = 1/2, minNSamples = ..minNSamples, minNGenes = ..minNGenes, tol = NULL, minRelativeWeight = 0.1, verbose = 1, indent = 0)
datExpr |
expression data. A matrix or data frame in which columns are genes and rows are samples. |
weights |
optional observation weights in the same format (and dimensions) as datExpr. |
minFraction |
minimum fraction of non-missing samples for a gene to be considered good. |
minNSamples |
minimum number of non-missing samples for a gene to be considered good. |
minNGenes |
minimum number of good genes for the data set to be considered fit for analysis. If the actual number of good genes falls below this threshold, an error will be issued. |
tol |
an optional 'small' number to compare the variance against. Defaults to the square of
|
minRelativeWeight |
observations whose relative weight is below this threshold will be considered missing. Here relative weight is weight divided by the maximum weight in the column (gene). |
verbose |
integer level of verbosity. Zero means silent, higher values make the output progressively more and more verbose. |
indent |
indentation for diagnostic messages. Zero means no indentation, each unit adds two spaces. |
This function iteratively identifies samples and genes with too many missing entries and genes with
zero variance. If weights are given, entries with relative weight (weight divided by maximum weight in the
column) below minRelativeWeight will be considered missing. The process is
repeated until the lists of good samples and genes are stable.
The constants ..minNSamples and ..minNGenes are both set to the value 4.
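The iteration described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the function name filterSketch, the fixed default for tol, and the sample criterion (at least max(minNGenes, minFraction * number of good genes) non-missing entries) are assumptions made for illustration only.

```r
# Simplified illustration of iterative sample/gene filtering.
# NOT the WGCNA source; filterSketch and its defaults are hypothetical.
filterSketch <- function(datExpr, weights = NULL, minFraction = 1/2,
                         minNSamples = 4, minNGenes = 4,
                         minRelativeWeight = 0.1, tol = 1e-20) {
  x <- as.matrix(datExpr)
  if (!is.null(weights)) {
    w <- as.matrix(weights)
    # Relative weight: weight divided by the maximum weight in the column (gene).
    colMax <- apply(w, 2, max, na.rm = TRUE)
    relW <- sweep(w, 2, colMax, "/")
    # Low-weight entries are treated as missing.
    x[!is.na(relW) & relW < minRelativeWeight] <- NA
  }
  # Variance of the non-missing values; 0 if fewer than two are present.
  safeVar <- function(v) {
    v <- v[!is.na(v)]
    if (length(v) < 2) 0 else var(v)
  }
  goodSamples <- rep(TRUE, nrow(x))
  goodGenes <- rep(TRUE, ncol(x))
  repeat {
    # Genes: enough non-missing samples and variance above tol.
    sub <- x[goodSamples, goodGenes, drop = FALSE]
    nPresent <- colSums(!is.na(sub))
    v <- vapply(seq_len(ncol(sub)), function(j) safeVar(sub[, j]), numeric(1))
    geneOK <- nPresent >= max(minNSamples, minFraction * nrow(sub)) & v > tol
    newGenes <- goodGenes
    newGenes[goodGenes] <- geneOK
    if (sum(newGenes) < minNGenes) stop("Too few good genes remain.")
    # Samples: enough non-missing entries among the good genes (assumed criterion).
    sub <- x[goodSamples, newGenes, drop = FALSE]
    sampleOK <- rowSums(!is.na(sub)) >= max(minNGenes, minFraction * ncol(sub))
    newSamples <- goodSamples
    newSamples[goodSamples] <- sampleOK
    # Stop once neither list changed in this pass.
    stable <- identical(newGenes, goodGenes) && identical(newSamples, goodSamples)
    goodGenes <- newGenes
    goodSamples <- newSamples
    if (stable) break
  }
  list(goodSamples = goodSamples, goodGenes = goodGenes)
}

# Toy data: 6 samples (rows) x 6 genes (columns); values are made up.
x <- matrix(c(1, 2, 3, 4, 5, 6,
              2, 4, 1, 3, 6, 5,
              7, 7, 7, 7, 7, 7,      # constant gene (zero variance)
              3, 1, 4, 1, 5, 9,
              2, 7, 1, 8, 2, 8,
              1, NA, NA, NA, NA, 2), # gene observed in only two samples
            nrow = 6)
x[6, c(1, 2, 4)] <- NA               # last sample is missing in most genes
res <- filterSketch(x)
print(res)
```

In this toy input the constant gene and the sparsely observed gene fail the gene criteria, after which the last sample has too few observed values among the remaining genes; a second pass changes nothing, so the loop stops.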
A list with the following components:
goodSamples |
A logical vector with one entry per sample that is TRUE if the sample is considered good and FALSE otherwise. |
goodGenes |
A logical vector with one entry per gene that is TRUE if the gene is considered good and FALSE otherwise. |
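A typical way to use the two components is to subset the expression data with them. The following is a short usage sketch, assuming the WGCNA package is installed; the toy matrix and the name datExprClean are made up for illustration.

```r
library(WGCNA)

# Toy data: 10 samples (rows) x 20 genes (columns); values are made up.
set.seed(1)
datExpr <- matrix(rnorm(200), nrow = 10, ncol = 20)
datExpr[, 3] <- NA        # a gene with no observations at all
datExpr[2, 1:15] <- NA    # a sample that is missing in most genes

gsg <- goodSamplesGenes(datExpr, verbose = 0)

# Keep only the rows (samples) and columns (genes) flagged as good.
datExprClean <- datExpr[gsg$goodSamples, gsg$goodGenes, drop = FALSE]
dim(datExprClean)
```

Using drop = FALSE keeps the result a matrix even if only one sample or gene survives the filtering.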
Peter Langfelder