superpc.predict.red

Feature selection for supervised principal components


Description

Forms reduced models to approximate the supervised principal component predictor.

Usage

superpc.predict.red(fit, 
                        data, 
                        data.test, 
                        threshold, 
                        n.components=3, 
                        n.shrinkage=20, 
                        shrinkages=NULL,
                        compute.lrtest=TRUE,
                        sign.wt="both",
                        prediction.type=c("continuous", "discrete"), 
                        n.class=2)

Arguments

fit

Object returned by superpc.train

data

Training data object, of the form described in the superpc.train documentation

data.test

Test data object; same form as data

threshold

Feature score threshold; usually estimated from superpc.cv

n.components

Number of principal components to examine; an integer from 1 up to the number of components used in training. Default 3.

n.shrinkage

Number of shrinkage values to consider. Default 20.

shrinkages

Shrinkage values to consider. Default NULL.

compute.lrtest

Should the likelihood ratio test be computed? Default TRUE

sign.wt

Signs of feature weights allowed: "both", "pos", or "neg"

prediction.type

Type of prediction: "continuous" (default) or "discrete". In the latter case, the supervised principal component score is divided into n.class groups.

n.class

Number of groups for discrete predictor. Default 2.
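
To illustrate what dividing a continuous score into n.class groups can look like, here is a minimal sketch in base R. It is not the package's internal code: the score vector is made up, and splitting at equally spaced quantiles is an assumption made for illustration only.

```r
# Hypothetical continuous predictor for 12 test observations
set.seed(1)
score <- rnorm(12)
n.class <- 3

# ASSUMPTION: groups are formed by cutting at equally spaced quantiles;
# the rule actually used by superpc may differ.
breaks <- quantile(score, probs = seq(0, 1, length.out = n.class + 1))
groups <- cut(score, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Number of observations assigned to each group
group_sizes <- table(groups)
print(group_sizes)
```

With quantile-based breaks and no tied scores, each group receives roughly the same number of observations.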

Details

Soft-thresholding by each of the "shrinkages" values is applied to the PC loadings. This reduces the number of features used in the model. The reduced predictor is then used in place of the supervised PC predictor.
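
The effect of soft-thresholding on a vector of loadings can be sketched in a few lines of base R. This is the generic soft-thresholding operator, not code taken from the package; the function name soft_threshold, the loadings, and the shrinkage values are all hypothetical.

```r
# Generic soft-thresholding: shrink each value toward zero by `shrinkage`
# and set values whose magnitude is below `shrinkage` to exactly zero.
soft_threshold <- function(w, shrinkage) {
  sign(w) * pmax(abs(w) - shrinkage, 0)
}

# Hypothetical loadings for five features and four shrinkage values
loadings <- c(0.9, -0.5, 0.2, -0.05, 0.0)
shrinkages <- c(0, 0.1, 0.3, 0.6)

# One column of shrunken loadings per shrinkage value
reduced <- sapply(shrinkages, function(s) soft_threshold(loadings, s))

# Features with a non-zero shrunken loading remain in the reduced model
num_features <- colSums(reduced != 0)
print(num_features)
```

Larger shrinkage values zero out more loadings, so the count of retained features falls as the shrinkage grows. This mirrors the relationship between the shrinkages and num.features components of the returned object.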

Value

shrinkages

Shrinkage values used

lrtest.reduced

Likelihood ratio tests for reduced models

num.features

Number of features used in each reduced model

feature.list

List of features used in each reduced model

coef

Least squares coefficients for each reduced model

import

Importance scores for features

wt

Weight for each feature, in constructing the reduced predictor

v.test

Outcome predictor from reduced models. Array of n.shrinkage by (number of test observations)

v.test.1df

Outcome combined predictor from reduced models. Array of n.shrinkage by (number of test observations)

n.components

Number of principal components used

type

Type of outcome

call

Calling sequence

Author(s)

  • Eric Bair, Ph.D.

  • Jean-Eudes Dazard, Ph.D.

  • Rob Tibshirani, Ph.D.

Maintainer: Jean-Eudes Dazard, Ph.D.

References

  • E. Bair and R. Tibshirani (2004). "Semi-supervised methods to predict patient survival from gene expression data." PLoS Biol, 2(4):e108.

  • E. Bair, T. Hastie, D. Paul, and R. Tibshirani (2006). "Prediction by supervised principal components." J. Am. Stat. Assoc., 101(473):119-137.

Examples

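library(superpc)

set.seed(332)

# Generate some data: 50 features (rows) by 30 samples (columns)
x <- matrix(rnorm(50*30), ncol=30)
# Outcomes built from the first right singular vector of x, plus noise
y <- 10 + svd(x[1:50,])$v[,1] + .1*rnorm(30)
ytest <- 10 + svd(x[1:50,])$v[,1] + .1*rnorm(30)
# Censoring indicators: 20 ones and 10 zeros in random order
censoring.status <- sample(c(rep(1,20), rep(0,10)))
censoring.status.test <- sample(c(rep(1,20), rep(0,10)))

# Assemble training and test data in the list form expected by superpc.train
featurenames <- paste("feature", as.character(1:50), sep="")
data <- list(x=x,
             y=y, 
             censoring.status=censoring.status, 
             featurenames=featurenames)
data.test <- list(x=x, 
                  y=ytest, 
                  censoring.status=censoring.status.test, 
                  featurenames=featurenames)

# Train on the survival outcome, then form the reduced models
a <- superpc.train(data, type="survival")
fit.red <- superpc.predict.red(a,
                               data, 
                               data.test, 
                               threshold=.6)
# Plot the likelihood ratio test statistics for the reduced models
superpc.plotred.lrtest(fit.red)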

superpc

Supervised Principal Components

v1.12
GPL (>= 3) | file LICENSE
Authors
Eric Bair [aut], Jean-Eudes Dazard [cre, ctb], Rob Tibshirani [ctb]
Initial release
2020-10-19
