
colMedians

Fast Row or Column-wise Medians of a Matrix


Description

Calculates the median for each column (row) of a matrix x. This gives the same result as apply(x, MM, median) with MM=2 (MM=1), but is more efficient.

Usage

colMedians(x, na.rm = FALSE, hasNA = TRUE, keep.names=TRUE)
rowMedians(x, na.rm = FALSE, hasNA = TRUE, keep.names=TRUE)

Arguments

x

a numeric n x p matrix.

na.rm

if TRUE, NAs are excluded first, otherwise not.

hasNA

logical indicating if x may contain NAs. If set to FALSE, no internal NA handling is performed which typically is faster.

keep.names

logical indicating if row or column names of x should become names of the result - as is the case for apply(x, MM, median).

Details

The implementation of rowMedians() and colMedians() is optimized for both speed and memory. To avoid coercing to doubles (and hence memory allocation), there is a special implementation for integer matrices. That is, if x is an integer matrix, then rowMedians(as.double(x)) (colMedians(as.double(x))) would require three times the memory of rowMedians(x) (colMedians(x)), but all this is avoided.
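The storage difference behind this remark can be checked in base R. The following is a minimal sketch, not the package's internal code: it only compares the size of an integer matrix with its double copy (roughly a factor of two for the copy alone; the "three times" figure above is the documentation's own claim and is not reproduced here). The matrix dimensions are arbitrary, and apply() is used as a stand-in for the fast functions.

```r
# Base-R sketch: an integer matrix stores 4 bytes per element, a double matrix 8.
set.seed(42)
n <- 200L; p <- 300L
xi <- matrix(sample.int(100L, n * p, replace = TRUE), n, p)  # integer storage
xd <- xi
storage.mode(xd) <- "double"   # coerce to double while keeping the dim attribute

size_int <- as.numeric(object.size(xi))
size_dbl <- as.numeric(object.size(xd))
size_dbl / size_int            # close to 2: the double copy alone doubles the memory

# The medians agree whichever storage mode is used (apply() as a stand-in here).
stopifnot(typeof(xi) == "integer", typeof(xd) == "double",
          all.equal(apply(xi, 1, median), apply(xd, 1, median)))
```

Note that as.double() on a matrix drops its dimensions, which is why the sketch uses storage.mode<- for the coercion instead.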

Value

a numeric vector of length p for colMedians() and of length n for rowMedians().
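As a small illustration of the result's length and names, the sketch below uses apply(x, MM, median) as the reference, since the Description states it gives the same result. The 2 x 3 matrix and its dimnames are made up for the example, and the robustbase calls only run if the package happens to be installed.

```r
# A 2 x 3 matrix (n = 2 rows, p = 3 columns) with hypothetical dimnames.
x <- matrix(c(1, 5, 2, 8, 3, 9), nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

col_med <- apply(x, 2, median)   # one value per column: length p = 3
row_med <- apply(x, 1, median)   # one value per row:    length n = 2

stopifnot(length(col_med) == ncol(x), length(row_med) == nrow(x),
          identical(names(col_med), colnames(x)),
          identical(names(row_med), rownames(x)))

# Optional cross-check against the fast versions, if robustbase is available.
if (requireNamespace("robustbase", quietly = TRUE)) {
  stopifnot(all.equal(robustbase::colMedians(x), col_med),
            all.equal(robustbase::rowMedians(x), row_med))
}
```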

Missing values

Missing values are excluded before calculating the medians only when na.rm is TRUE and hasNA is not FALSE. Note that na.rm has no effect and is automatically treated as FALSE when hasNA is FALSE, i.e., internally, before computations start, the following is executed:

if (!hasNA)        ## If there are no NAs, don't try to remove them
     narm <- FALSE
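The effect of na.rm can be sketched with base R's median(), which the fast functions are documented to match. The tiny matrix here is invented for illustration; the hasNA = FALSE call is shown only on a row known to be free of NAs, and the robustbase part is skipped if the package is not installed.

```r
# Row 1 contains an NA, row 2 does not.
x <- rbind(c(1, NA, 3),
           c(4,  5, 6))

med_keep <- apply(x, 1, median)                # row 1 is NA, row 2 is 5
med_rm   <- apply(x, 1, median, na.rm = TRUE)  # row 1 is 2,  row 2 is 5
stopifnot(is.na(med_keep[1]), med_keep[2] == 5,
          all.equal(med_rm, c(2, 5)))

if (requireNamespace("robustbase", quietly = TRUE)) {
  # Same NA-removing behaviour from the fast version.
  stopifnot(all.equal(robustbase::rowMedians(x, na.rm = TRUE), med_rm))
  # hasNA = FALSE skips NA handling, so use it only on data known to be NA-free.
  x0 <- x[2, , drop = FALSE]
  stopifnot(all.equal(robustbase::rowMedians(x0, hasNA = FALSE), 5))
}
```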

Author(s)

Henrik Bengtsson, Harris Jaffee, Martin Maechler

See Also

See wgt.himedian() for a weighted hi-median, and colWeightedMedians() etc. from package matrixStats for weighted medians.
For mean estimates, see rowMeans() and colMeans(), documented together with colSums().

Examples

library(robustbase)  # provides colMedians() and rowMedians()
set.seed(1); n <- 234; p <- 543 # n*p = 127'062
x <- matrix(rnorm(n*p), n, p)
x[sample(seq_along(x), size= n*p / 256)] <- NA
R1 <- system.time(r1 <- rowMedians(x, na.rm=TRUE))
C1 <- system.time(y1 <- colMedians(x, na.rm=TRUE))
R2 <- system.time(r2 <- apply(x, 1, median, na.rm=TRUE))
C2 <- system.time(y2 <- apply(x, 2, median, na.rm=TRUE))
R2 / R1 # speedup factor: ~= 4   {platform dependent}
C2 / C1 # speedup factor: ~= 5.8 {platform dependent}
stopifnot(all.equal(y1, y2, tol=1e-15),
          all.equal(r1, r2, tol=1e-15))

(m <- cbind(x1=3, x2=c(4:1, 3:4,4)))
stopifnot(colMedians(m) == 3,
          all.equal(colMeans(m), colMedians(m)),# <- including names !
          all.equal(rowMeans(m), rowMedians(m)))

robustbase

Basic Robust Statistics

v0.93-7
GPL (>= 2)
Authors
Martin Maechler [aut, cre] (<https://orcid.org/0000-0002-8685-9910>), Peter Rousseeuw [ctb] (Qn and Sn), Christophe Croux [ctb] (Qn and Sn), Valentin Todorov [aut] (most robust Cov), Andreas Ruckstuhl [aut] (nlrob, anova, glmrob), Matias Salibian-Barrera [aut] (lmrob orig.), Tobias Verbeke [ctb, fnd] (mc, adjbox), Manuel Koller [aut] (mc, lmrob, psi-func.), Eduardo L. T. Conceicao [aut] (MM-, tau-, CM-, and MTL- nlrob), Maria Anna di Palma [ctb] (initial version of Comedian)
Initial release
2021-01-04
