Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

meanX

Functions to create test statistic closures and apply them to data


Description

The package multtest uses closures in the function MTP to compute test statistics. The closure used depends on the value of the argument test. These functions create the closures for different tests, given any additional variables, such as outcomes or covariates. The function get.Tn calls wapply to apply one of these closures to observed data (and possibly weights).

One exception for how test statistics are calculated in multtest involve tests of correlation parameters, where the change of dimensionality between the p variables in X and the p-choose-2 hypotheses corresponding to the number of pairwise correlations presents a challenge. In this case, the test statistics are calculated directly in corr.Tn and returned in a manner similar to the test statistic function closures. No resampling is done either, since the null distribution for tests of correlation parameters are only implemented when nulldist='ic'. Details are given below.

Usage

meanX(psi0 = 0, na.rm = TRUE, standardize = TRUE, 
alternative = "two.sided", robust = FALSE)

diffmeanX(label, psi0 = 0, var.equal = FALSE, na.rm = TRUE, 
standardize = TRUE, alternative = "two.sided", robust = FALSE)

FX(label, na.rm = TRUE, robust = FALSE)

blockFX(label, na.rm = TRUE, robust = FALSE)

twowayFX(label, na.rm = TRUE, robust = FALSE)

lmX(Z = NULL, n, psi0 = 0, na.rm = TRUE, standardize = TRUE, 
alternative = "two.sided", robust = FALSE)

lmY(Y, Z = NULL, n, psi0 = 0, na.rm = TRUE, standardize = TRUE, 
alternative = "two.sided", robust = FALSE)

coxY(surv.obj, strata = NULL, psi0 = 0, na.rm = TRUE, standardize = TRUE, 
alternative = "two.sided", init = NULL, method = "efron")

get.Tn(X, stat.closure, W = NULL)

corr.Tn(X, test, alternative, use = "pairwise")

Arguments

X

A matrix, data.frame or ExpressionSet containing the raw data. In the case of an ExpressionSet, exprs(X) is the data of interest and pData(X) may contain outcomes and covariates of interest. For currently implemented tests, one hypothesis is tested for each row of the data.

W

A vector or matrix containing non-negative weights to be used in computing the test statistics. If a matrix, W must be the same dimension as X with one weight for each value in X. If a vector, W may contain one weight for each observation (i.e. column) of X or one weight for each variable (i.e. row) of X. In either case, the weights are duplicated apporpraiately. Weighted f-tests are not available. Default is 'NULL'.

label

A vector containing the class labels for t- and f-tests. For the blockFX function, observations are divided into l blocks of n/l observations. Within each block there may be k groups with k>2. For this test, there is only one observation per block*group combination. The labels (and corresponding rows of Z and columns of X and W) must be ordered by block and within each block ordered by group. Groups must be labeled with integers 1,...,k. For the twowayFX function, observations are divided into l blocks. Within each block there may be k groups with k>2. There must be more than one observation per group*block combination for this test. The labels (and corresponding rows of Z and columns of X and W) must be ordered by block and within each block ordered by group. Groups must be labeled with integers 1,...,k.

Y

A vector or factor containing the outcome of interest for linear models. This may be a continuous or polycotomous dependent variable.

surv.object

A survival object as returned by the Surv function, to be used as response in coxY.

Z

A vector, factor, or matrix containing covariate data to be used in the linear regression models. Each variable should be in one column.

strata

A vector, factor, or matrix containing covariate data to be used in the Cox regression models. Covariate data will be converted to a factor variable (via the strata function) for use in the coxph function. Each variable should be in one column.

n

The sample size, e.g. length(Y) or nrow(Z).

psi0

Hypothesized null value for the parameter of interest (e.g. mean or difference in means), typically zero (default).

var.equal

Indicator of whether to use t-statistics that assume equal variance in the two groups when computing the denominator of the test statistics.

na.rm

Logical indicating whether to remove observations with an NA. Default is 'TRUE'.

standardize

Logical indicating whether to use the standardized version of the test statistics (usual t-statistics are standardized). Default is 'TRUE'.

alternative

Character string indicating the alternative hypotheses, by default 'two.sided'. For one-sided tests, use 'less' or 'greater' for null hypotheses of 'greater than or equal' (i.e. alternative is 'less') and 'less than or equal', respectively.

robust

Logical indicating whether to use robust versions of the test statistics.

init

Vector of initial values of the iteration in coxY function, as used in coxph in the survival package. Default initial value is zero for all variables (init=NULL).

method

A character string specifying the method for tie handling in coxY function, as used in coxph in the survival package. Default is "efron".

test

For corr.Tn, a character string of either 't.cor' or 'z.cor' indicating whether t-statistics or Fisher's z-statistics are to be calculated when probing hypotheses involving correlation parameters.

use

Similar to the options in cor, a character string giving a method for computing covariances in the presence of missing values. Default is 'pairwise', which allows for the covariance/correlation matrix to be calculated using the most information possible when NAs are present.

Details

The use of closures, in the style of the genefilter package, allows uniform data input for all MTPs and facilitates the extension of the package's functionality by adding, for example, new types of test statistics. Specifically, for each value of the MTP argument test, a closure is defined which consists of a function for computing the test statistic (with only two arguments, a data vector x and a corresponding weight vector w, with default value of NULL) and its enclosing environment, with bindings for relevant additional arguments. These arguments may include null values psi0, outcomes (Y, label, surv.object), and covariates Z. The vectors x and w are rows of the matrices X and W.

In the MTP function, the closure is first used to compute the vector of observed test statistics, and then, in each bootstrap iteration, to produce the estimated joint null distribution of the test statistics. In both cases, the function get.Tn is used to apply the closure to rows of the matrices of data (X) and weights (W). Thus, new test statistics can be added to multtest package by simply defining a new closure and adding a corresponding value for the test argument to the MTP function.

As mentioned above, one exception made to the closure rule in multtest was done for the case of tests involving correlation parameters (i.e., when test='t.cor' or test='z.cor'). In particular, the change of dimension between the number of variables in X and the number of hypotheses corresponding to all pairwise correlation parameters presented a challenge. In this setting, a 'closure-like' function was written which returns choose(dim(X)[2],2) test statistics stored in a matrix obs described below. No resampling methods are available for 't.cor' and 'z.cor', as their only current available null distribution is based on influence curves (nulldist='ic'), meaning that the test statistics null distribution is sampled directly from an appropriate multivariate normal distribution. In this manner, the data are used to calculate test statistics and null distribution estimates of the appropriate length and dimension, with sidedness correctly accounted for. With care, these objects for tests of correlation can then be integrated into the rest of the (modular) multtest functionality to perform multiple testing using other available argument options in the functions MTP or EBMTP.

Value

For meanX, diffmeanX, FX, blockFX, twowayFX, lmX, lmY, and coxY, a closure consisting of a function for computing test statistics and its enclosing environment. For get.Tn and corr.Tn, the observed test statistics stored in a matrix obs with numerator (possibly absolute value or negative, depending on the value of alternative) in the first row, denominator in the second row, and a 1 or -1 in the third row (depending on the value of alternative). The vector of observed test statistics is obs[1,]*obs[3,]/obs[2,].

Author(s)

Katherine S. Pollard, Houston N. Gilbert, and Sandra Taylor, with design contributions from Duncan Temple Lang, Sandrine Dudoit and Mark J. van der Laan

See Also

Examples

data<-matrix(rnorm(200),nr=20)
#one-sample t-statistics
ttest<-meanX(psi0=0,na.rm=TRUE,standardize=TRUE,alternative="two.sided",robust=FALSE)
obs<-wapply(data,1,ttest,W=NULL)
statistics<-obs[1,]*obs[3,]/obs[2,]
statistics

#for tests of correlation parameters,
#note change of dimension compared to dim(data),
#function calculate statistics directly in same form as above
obs <- corr.Tn(data,test="t.cor",alternative="greater")
dim(obs)
statistics<-obs[1,]*obs[3,]/obs[2,]
length(statistics)

#two-way F-statistics
FData <- matrix(rnorm(5*60),nr=5)
label<-rep(c(rep(1,10), rep(2,10), rep(3,10)),2)
twowayf<-twowayFX(label)
obs<-wapply(FData,1,twowayf,W=NULL)
statistics<-obs[1,]*obs[3,]/obs[2,]
statistics

multtest

Resampling-based multiple hypothesis testing

v2.46.0
LGPL
Authors
Katherine S. Pollard, Houston N. Gilbert, Yongchao Ge, Sandra Taylor, Sandrine Dudoit
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.