Compute Summary Statistics on a Vector
A number of statistical summary functions are provided for use with summary.formula and summarize (as well as tapply and by themselves).
smean.cl.normal computes 3 summary variables: the sample mean and lower and upper Gaussian confidence limits based on the t-distribution.
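As a rough illustration of the t-based limits described above, here is a minimal base-R sketch. The helper name mean_cl_t and the output names Mean, Lower and Upper are choices made for this example, not taken from the package source; the multiplier follows the default shown in the usage, qt((1+conf.int)/2, n-1), applied to the standard error of the mean.

```r
# Hypothetical stand-in written for illustration; not the Hmisc implementation.
mean_cl_t <- function(x, conf.int = 0.95, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  xbar <- mean(x)
  # With fewer than 2 observations the standard error is undefined
  if (n < 2) return(c(Mean = xbar, Lower = NA_real_, Upper = NA_real_))
  se   <- sd(x) / sqrt(n)                  # standard error of the mean
  mult <- qt((1 + conf.int) / 2, n - 1)    # t quantile with n-1 degrees of freedom
  c(Mean = xbar, Lower = xbar - mult * se, Upper = xbar + mult * se)
}

set.seed(1)
x  <- c(rnorm(50), NA)   # the NA is dropped because na.rm = TRUE
ci <- mean_cl_t(x)
ci
```

Under these assumptions the limits should agree with those reported by t.test(x)$conf.int, which is a convenient cross-check.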
smean.sd computes the mean and standard deviation.
smean.sdl computes the mean plus or minus a constant times the standard deviation.
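The "mean plus or minus a constant times the standard deviation" summary can be sketched in a few lines of base R. The function mean_sdl below is a hypothetical stand-in with output names chosen for this example; it only illustrates the arithmetic and may differ from the package function in details.

```r
# Hypothetical stand-in written for illustration; not the Hmisc implementation.
mean_sdl <- function(x, mult = 2, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  m <- mean(x)
  s <- sd(x)                       # sample standard deviation (n - 1 denominator)
  c(Mean = m, Lower = m - mult * s, Upper = m + mult * s)
}

x   <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)
res <- mean_sdl(x)                 # mean is 5; limits are 5 -/+ 2 * sqrt(32/7)
res
```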
smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality.
These functions all delete NAs automatically.
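To make the bootstrap idea concrete, the sketch below resamples the data with replacement, records the mean of each resample, and takes quantiles of those means as confidence limits. It assumes a simple percentile interval. Whether this matches the package's internals in every detail is an assumption, and boot_mean_cl is a hypothetical name. The reps argument mirrors the documented option of returning the resampled means as an attribute.

```r
# Hypothetical stand-in written for illustration; not the Hmisc implementation.
boot_mean_cl <- function(x, conf.int = 0.95, B = 1000, na.rm = TRUE, reps = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  xbar <- mean(x)
  if (n < 2) return(c(Mean = xbar, Lower = NA_real_, Upper = NA_real_))
  # Mean of each of B resamples drawn with replacement
  means <- replicate(B, mean(sample(x, n, replace = TRUE)))
  q <- quantile(means, c((1 - conf.int) / 2, (1 + conf.int) / 2), names = FALSE)
  res <- c(Mean = xbar, Lower = q[1], Upper = q[2])
  if (reps) attr(res, "reps") <- means   # keep the bootstrapped means if requested
  res
}

set.seed(1)
x <- c(rexp(80), NA)                     # skewed data with one missing value
b <- boot_mean_cl(x, B = 500, reps = TRUE)
b
```

Keeping the resampled means makes it possible to combine two independent sets of them, for example to form an interval for a difference in means.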
smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas.
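The "equal tail areas" rule can be illustrated with base R's quantile: for coverage conf.int, the outer quantiles sit at (1-conf.int)/2 and (1+conf.int)/2. The helper median_hilow and its output names are hypothetical, and the sketch relies on quantile's default interpolation type, which is an assumption rather than something stated here.

```r
# Hypothetical stand-in written for illustration; not the Hmisc implementation.
median_hilow <- function(x, conf.int = 0.95, na.rm = TRUE) {
  # Median plus two outer quantiles that leave equal probability in each tail
  probs <- c(0.5, (1 - conf.int) / 2, (1 + conf.int) / 2)
  q <- quantile(x, probs, na.rm = na.rm, names = FALSE)
  c(Median = q[1], Lower = q[2], Upper = q[3])
}

x   <- c(1:9, NA)
res <- median_hilow(x, conf.int = 0.5)   # median with the 25th and 75th percentiles
res
```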
smean.cl.normal(x, mult=qt((1+conf.int)/2,n-1), conf.int=.95, na.rm=TRUE)
smean.sd(x, na.rm=TRUE)
smean.sdl(x, mult=2, na.rm=TRUE)
smean.cl.boot(x, conf.int=.95, B=1000, na.rm=TRUE, reps=FALSE)
smedian.hilow(x, conf.int=.95, na.rm=TRUE)
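x | for summary functions, a numeric vector
na.rm | defaults to TRUE, so that NAs are removed automatically
mult | for smean.cl.normal, the multiplier used to form the confidence limits (default: the t-distribution quantile shown in the usage); for smean.sdl, the constant multiplying the standard deviation (default 2)
conf.int | for smean.cl.normal and smean.cl.boot, the confidence level; for smedian.hilow, the coverage of the outer quantiles (for example, .5 gives the 25th and 75th percentiles)
B | number of bootstrap resamples for smean.cl.boot
reps | set to TRUE to have smean.cl.boot return the bootstrapped means as the 'reps' attribute of the result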
a vector of summary statistics
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
set.seed(1)
x <- rnorm(100)
smean.sd(x)
smean.sdl(x)
smean.cl.normal(x)
smean.cl.boot(x)
smedian.hilow(x, conf.int=.5)  # 25th and 75th percentiles

# Function to compute 0.95 confidence interval for the difference in two means
# g is grouping variable
bootdif <- function(y, g) {
  g <- as.factor(g)
  a <- attr(smean.cl.boot(y[g==levels(g)[1]], B=2000, reps=TRUE),'reps')
  b <- attr(smean.cl.boot(y[g==levels(g)[2]], B=2000, reps=TRUE),'reps')
  meandif <- diff(tapply(y, g, mean, na.rm=TRUE))
  a.b <- quantile(b-a, c(.025,.975))
  res <- c(meandif, a.b)
  names(res) <- c('Mean Difference','.025','.975')
  res
}