superpc.decorrelate

Decorrelate features with respect to competing predictors


Description

Fits a linear model to each feature as a function of some competing predictors and replaces the feature with the residuals from that fit. These "decorrelated" features are then used in the superpc model-building process to look explicitly for predictors that are independent of the competing predictors. This is useful, for example, when the competing predictors are clinical predictors such as stage or grade.
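The residualization described above can be sketched in base R. This is a minimal illustration, not necessarily the package's own implementation: it assumes the fit is an ordinary least-squares regression of every feature on all competing predictors, and the names p, n, competing and x.decor are made up for the example.

```r
set.seed(1)

# Toy data in the layout the function expects:
# one feature per row, one observation per column
p <- 5
n <- 30
x <- matrix(rnorm(p * n), nrow = p)

# Hypothetical competing predictors: one numeric, one discrete (a factor)
competing <- data.frame(
  pred1 = rnorm(n),
  pred2 = factor(sample(c(1, 2), size = n, replace = TRUE))
)

# A matrix response makes lm() fit one least-squares regression per feature
fit <- lm(t(x) ~ ., data = competing)

# Residuals are returned as observations x features;
# transpose back to features x observations
x.decor <- t(residuals(fit))

# Largest absolute correlation between any decorrelated feature and pred1
max.abs.cor <- max(abs(cor(t(x.decor), competing$pred1)))
```

Because least-squares residuals are orthogonal to every column of the design matrix, max.abs.cor should be zero up to floating-point error, and the residuals should also average to zero within each level of pred2.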

Usage

superpc.decorrelate(x, 
                        competing.predictors)

Arguments

x

Matrix of features, with one feature per row and one observation per column.

competing.predictors

List of one or more competing predictors. Discrete predictors should be factors

Value

Returns the lm (linear model) fit of the rows of x on the competing predictors. The residuals of this fit are the decorrelated features.

Author(s)

  • Eric Bair, Ph.D.

  • Jean-Eudes Dazard, Ph.D.

  • Rob Tibshirani, Ph.D.

Maintainer: Jean-Eudes Dazard, Ph.D.

References

  • E. Bair and R. Tibshirani (2004). "Semi-supervised methods to predict patient survival from gene expression data." PLoS Biol, 2(4):e108.

  • E. Bair, T. Hastie, D. Paul, and R. Tibshirani (2006). "Prediction by supervised principal components." J. Am. Stat. Assoc., 101(473):119-137.

Examples

set.seed(332)

#generate some data
x <- matrix(rnorm(50*30), ncol=30)
y <- 10 + svd(x[1:50,])$v[,1] + .1*rnorm(30)
censoring.status <- sample(c(rep(1,20), rep(0,10)))

featurenames <- paste("feature", as.character(1:50), sep="")
competing.predictors <- list(pred1=rnorm(30), 
                             pred2=as.factor(sample(c(1,2), 
                                             replace=TRUE, 
                                             size=30)))

#decorrelate x. Remember to decorrelate test data in the same way, before making predictions.
foo <- superpc.decorrelate(x, competing.predictors)
#the residuals are observations x features, so transpose back to features x observations
xnew <- t(foo$residuals)

#now use xnew in superpc
data <- list(x=xnew, 
             y=y, 
             censoring.status=censoring.status, 
             featurenames=featurenames)
a <- superpc.train(data, type="survival")

#etc.
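
The comment above about decorrelating test data "in the same way" does not say how. One possible reading, sketched here in base R under that assumption, is to reuse the coefficients estimated on the training data and subtract the fitted values from the test features. The helper make.predictors and all variable names are hypothetical, and the lm() call stands in for superpc.decorrelate rather than reproducing it.

```r
set.seed(2)
p <- 5
n.train <- 30
n.test <- 10

# Hypothetical helper: competing predictors for n observations.
# Fixing the factor levels keeps training and test data compatible.
make.predictors <- function(n) {
  data.frame(
    pred1 = rnorm(n),
    pred2 = factor(sample(c(1, 2), size = n, replace = TRUE),
                   levels = c(1, 2))
  )
}

# Features in rows, observations in columns
x.train <- matrix(rnorm(p * n.train), nrow = p)
x.test <- matrix(rnorm(p * n.test), nrow = p)
comp.train <- make.predictors(n.train)
comp.test <- make.predictors(n.test)

# Fit on the training data only
fit <- lm(t(x.train) ~ ., data = comp.train)

# Training features: residuals of the fit
x.train.decor <- t(residuals(fit))

# Test features: subtract what the training fit predicts
# from the test-set competing predictors
x.test.decor <- x.test - t(predict(fit, newdata = comp.test))
```

Reusing the training fit keeps the transformation fixed between training and prediction. Refitting on the test set would be a different procedure, and the source does not say which one is intended.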

superpc

Supervised Principal Components

v1.12
GPL (>= 3) | file LICENSE
Authors
Eric Bair [aut], Jean-Eudes Dazard [cre, ctb], Rob Tibshirani [ctb]
Initial release
2020-10-19