poolVar

Pool Sample Variances with Unequal Variances


Description

Compute the Satterthwaite (1946) approximation to the distribution of a weighted sum of sample variances.

Usage

poolVar(var, df=n-1, multiplier=1/n, n)

Arguments

var

numeric vector of independent sample variances

df

numeric vector of degrees of freedom for the sample variances

multiplier

numeric vector giving multipliers for the sample variances

n

numeric vector of sample sizes

Details

The sample variances var are assumed to be independent and to follow scaled chi-square distributions. A scaled chi-square approximation to the distribution of sum(multiplier * var) is found by equating first and second moments. On output, the sum being approximated is equal to multiplier * var, computed from the returned components, and follows approximately a scaled chi-square distribution on df degrees of freedom. The approximation was proposed by Satterthwaite (1946).
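The moment-matching step can be written out as a short derivation. This is a sketch under the assumptions stated in the paragraph above; the symbols s_i^2, d_i, m_i, sigma_i^2, a and nu are notation introduced here for illustration and do not appear in the package documentation.

```latex
% Notation (introduced here, not from the help page):
%   s_i^2 = var[i],  d_i = df[i],  m_i = multiplier[i],
%   \sigma_i^2 = true variance of group i,  S = weighted sum to be approximated.
\begin{align*}
s_i^2 &\sim \sigma_i^2\,\chi^2_{d_i}/d_i, \qquad
S = \sum_i m_i s_i^2, \\
\mathrm{E}(S) &= \sum_i m_i \sigma_i^2, \qquad
\mathrm{Var}(S) = 2\sum_i \frac{m_i^2 \sigma_i^4}{d_i}, \\
S &\approx a\,\chi^2_{\nu}
\;\Rightarrow\; a\nu = \mathrm{E}(S), \quad 2a^2\nu = \mathrm{Var}(S), \\
\nu &= \frac{2\,\mathrm{E}(S)^2}{\mathrm{Var}(S)}
\approx \frac{\left(\sum_i m_i s_i^2\right)^2}{\sum_i \left(m_i s_i^2\right)^2 / d_i}.
\end{align*}
```

The final approximation replaces each unknown sigma_i^2 by its estimate s_i^2, which is why the effective degrees of freedom depend on the observed variances.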

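If there are only two groups, the degrees of freedom are one less than the sample sizes, and the multipliers are the reciprocals of the sample sizes (the defaults when n is supplied), then the square root of multiplier * var from the output is the denominator of Welch's t-test for unequal variances, and df is the corresponding approximate degrees of freedom.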
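The two-group case can be checked without limma. The following is a minimal base-R sketch, not the package's source code: it computes the weighted sum and the moment-matched degrees of freedom by hand and compares them with t.test(), which applies the Welch approximation by default. The variable names (terms, se2, df_eff) and the seed are illustrative choices.

```r
# Base-R sketch: Satterthwaite pooling for two groups, checked against t.test().
# This is an illustration only and does not call limma::poolVar.
set.seed(1)                          # arbitrary seed, for reproducibility only
x <- rnorm(10, mean = 1, sd = 2)
y <- rnorm(20, mean = 2, sd = 1)

s2 <- c(var(x), var(y))              # independent sample variances
n  <- c(length(x), length(y))        # sample sizes
df <- n - 1                          # degrees of freedom of each variance
m  <- 1 / n                          # multipliers

terms  <- m * s2                     # contributions to the weighted sum
se2    <- sum(terms)                 # weighted sum = squared Welch denominator
df_eff <- se2^2 / sum(terms^2 / df)  # moment-matched degrees of freedom

tstat  <- (mean(x) - mean(y)) / sqrt(se2)
pvalue <- 2 * pt(-abs(tstat), df = df_eff)

welch <- t.test(x, y)                # var.equal = FALSE is the default
all.equal(unname(welch$statistic), tstat)
all.equal(unname(welch$parameter), df_eff)
all.equal(welch$p.value, pvalue)
```

Here se2 plays the role of multiplier * var on output, and df_eff the role of the returned df.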

Value

A list with components

var

effective pooled sample variance

df

effective pooled degrees of freedom

multiplier

pooled multiplier

Author(s)

Gordon Smyth

References

Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. Biometrika 29, 350-362.

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin 2, 110-114.

Welch, B. L. (1947). The generalization of 'Student's' problem when several different population variances are involved. Biometrika 34, 28-35.

Welch, B. L. (1949). Further note on Mrs. Aspin's tables and on certain approximations to the tabled function. Biometrika 36, 293-296.

Examples

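#  Welch's t-test with unequal variances
library(limma)
x <- rnorm(10, mean=1, sd=2)
y <- rnorm(20, mean=2, sd=1)
s2 <- c(var(x), var(y))         # sample variances of the two groups
n <- c(length(x), length(y))    # sample sizes
out <- poolVar(var=s2, n=n)     # df and multiplier default to n-1 and 1/n
#  out$var*out$multiplier is the squared denominator of the t-statistic
tstat <- (mean(x)-mean(y)) / sqrt(out$var*out$multiplier)
pvalue <- 2*pt(-abs(tstat), df=out$df)
#  Equivalent to t.test(x,y)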

limma

Linear Models for Microarray Data

v3.46.0
GPL (>=2)
Authors
Gordon Smyth [cre,aut], Yifang Hu [ctb], Matthew Ritchie [ctb], Jeremy Silver [ctb], James Wettenhall [ctb], Davis McCarthy [ctb], Di Wu [ctb], Wei Shi [ctb], Belinda Phipson [ctb], Aaron Lun [ctb], Natalie Thorne [ctb], Alicia Oshlack [ctb], Carolyn de Graaf [ctb], Yunshun Chen [ctb], Mette Langaas [ctb], Egil Ferkingstad [ctb], Marcus Davy [ctb], Francois Pepin [ctb], Dongseok Choi [ctb]
Initial release
2020-10-19