ClustVarLV: lm_CLV – R documentation

lm_CLV

linear model based on CLV

Description

Predicts a response variable, y, from clusters of predictor variables, X. A boosting-like procedure identifies groups of predictors, together with their associated latent components, that are well correlated with the current residuals of the response variable y. Sparsity can be introduced through the strategy option ("sparselv" or "kplusone") and the rho parameter.
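The idea of repeatedly fitting a shrunken coefficient to the current residuals can be sketched in base R. This is a simplified illustration only, not the algorithm implemented in ClustVarLV: the groups are assumed to be known in advance, each latent component is taken to be the standardized group mean, and the stopping rule is one plausible reading of "the relative sum of squares stabilizes". All data are simulated.

```r
set.seed(1)
n <- 100

# Simulated data: two groups of three predictors, each driven by a latent variable
z1 <- rnorm(n)
z2 <- rnorm(n)
X <- cbind(z1 + matrix(rnorm(n * 3, sd = 0.3), n, 3),
           z2 + matrix(rnorm(n * 3, sd = 0.3), n, 3))
X <- scale(X)
y <- 2 * z1 - z2 + rnorm(n, sd = 0.5)

# Assumption: groups are fixed here, whereas lm_CLV identifies them itself
groups <- list(1:3, 4:6)
comps <- scale(sapply(groups, function(g) rowMeans(X[, g])))

shrinkp   <- 0.5     # shrinkage applied to each update
maxiter   <- 100     # maximum number of extraction steps
threshold <- 1e-05   # stop when the relative decrease in SSE is below this

res     <- y - mean(y)              # start from the centered response
alpha   <- numeric(length(groups))  # aggregated coefficient per component
sse_old <- sum(res^2)

for (it in seq_len(maxiter)) {
  # Pick the component most correlated with the current residuals
  k <- which.max(abs(cor(comps, res)))
  # Shrunken least-squares coefficient of the residuals on that component
  a <- shrinkp * sum(comps[, k] * res) / sum(comps[, k]^2)
  # Coefficients accumulate when the same component is selected again
  alpha[k] <- alpha[k] + a
  res <- res - a * comps[, k]
  sse_new <- sum(res^2)
  if ((sse_old - sse_new) / sse_old < threshold) break
  sse_old <- sse_new
}

rmse <- sqrt(mean(res^2))   # calibration RMSE of this toy model
```

Smaller values of `shrinkp` take more steps to reach the same fit, which is the usual trade-off in boosting-type procedures.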

Usage

lm_CLV(X, y, method = "directional", sX = TRUE, shrinkp = 0.5,
  strategy = "none", rho = 0.3, validation = FALSE, id.test = NULL,
  maxiter = 100, threshold = 1e-05)
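A minimal call might look like the sketch below. The data are simulated purely for illustration, and the call is guarded so that the script still runs when ClustVarLV is not installed. Only arguments shown in the signature above are used; the result is inspected with `str()` rather than by assuming a particular layout of the returned object.

```r
set.seed(123)
n <- 60

# Simulated predictors: two correlated blocks plus two noise variables
z1 <- rnorm(n)
z2 <- rnorm(n)
X <- cbind(z1 + matrix(rnorm(n * 3, sd = 0.4), n, 3),
           z2 + matrix(rnorm(n * 3, sd = 0.4), n, 3),
           matrix(rnorm(n * 2), n, 2))
colnames(X) <- paste0("x", seq_len(ncol(X)))
y <- z1 - 0.5 * z2 + rnorm(n, sd = 0.5)

fit <- NULL
if (requireNamespace("ClustVarLV", quietly = TRUE)) {
  # Default settings written out explicitly for readability
  fit <- ClustVarLV::lm_CLV(X, y, method = "directional", sX = TRUE,
                            shrinkp = 0.5, strategy = "none",
                            maxiter = 100, threshold = 1e-05)
  str(fit, max.level = 1)   # top-level overview of the returned object
}
```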

Arguments

`X`	: The matrix of the predictors, the variables to be clustered
`y`	: The response variable (usually numeric). If y is a binary factor, a 0/1 indicator variable is generated and a Bayes rule is used to compute class probabilities. The performance criterion is the RMSE for a numerical response, and the RMSE and the error rate for a binary factor.
`method`	: The criterion to be used in the cluster analysis. 1 or "directional": the squared covariance is used as a measure of proximity (directional groups). 2 or "local": the covariance is used as a measure of proximity (local groups).
`sX`	: TRUE/FALSE, i.e. whether or not the columns of X are standardized (TRUE by default)
`shrinkp`	: shrinkage parameter used in the boosting (at most 1; 0.5 by default). If shrinkp is a vector of values greater than 0 and lower than or equal to 1, the outputs are given for each value.
`strategy`	: "none" (by default), or "kplusone" (an additional cluster for the unclassifiable variables), or "sparselv" (zero loadings for the unclassifiable variables)
`rho`	: a threshold of correlation between 0 and 1 (used in "kplusone" or "sparselv" strategy, 0.3 by default)
`validation`	: TRUE/FALSE, i.e. whether or not a test set is used. No validation by default.
`id.test`	: if validation==TRUE, the numbers (row indices) of the observations used as the test set
`maxiter`	: the maximum number of components extracted (100 by default)
`threshold`	: used in a stopping rule, when the relative sum of squares of the calibration errors stabilizes (1e-05 by default)
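The sketch below shows how the validation and sparsity arguments might be combined. It assumes that `id.test` takes the row indices of the test observations, which is one reading of the description above. The data are simulated, and the call is skipped when ClustVarLV is not installed.

```r
set.seed(42)
n <- 80

# Simulated predictors: two informative blocks and two pure-noise columns
z1 <- rnorm(n)
z2 <- rnorm(n)
X <- cbind(z1 + matrix(rnorm(n * 3, sd = 0.4), n, 3),
           z2 + matrix(rnorm(n * 3, sd = 0.4), n, 3),
           matrix(rnorm(n * 2), n, 2))
colnames(X) <- paste0("x", seq_len(ncol(X)))
y <- z1 - 0.5 * z2 + rnorm(n, sd = 0.5)

# Assumption: id.test holds row indices; here 20 observations are held out
id.test <- sort(sample(seq_len(n), size = 20))

fit <- NULL
if (requireNamespace("ClustVarLV", quietly = TRUE)) {
  # "sparselv" sets zero loadings for variables below the rho threshold
  fit <- ClustVarLV::lm_CLV(X, y, strategy = "sparselv", rho = 0.3,
                            validation = TRUE, id.test = id.test)
  print(names(fit))   # list the components actually returned
}
```

Passing a vector such as `c(0.25, 0.5, 1)` as `shrinkp` should, according to the argument description, return the outputs for each value.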

Value

`Group`	a list of the groups of X variables, in the order in which they were first extracted.
`Comp`	a list of the latent components associated with the groups of X variables extracted.
`Load`	a list for the loadings of the X variables in the latent component.
`Alpha`	a list of the regression coefficients to be applied to the latent components. The coefficients are aggregated when the same latent component is extracted several times during the iterative steps.
`Beta`	a list of the beta coefficients to be applied to the pretreated predictors. For a model with the first A latent components, the first A elements of the list must be added together.
`GroupImp`	Group Importance i.e. the decrease of the residuals' variance provided by the CLV components in the model.
`RMSE.cal`	the root mean square error for the calibration set, at each step of the procedure.
`ERRrate.cal and rocAUC.cal`	when y is a binary factor, the classification error rate and the area under the ROC curve (AUC), on the basis of the calibration set, at each step of the procedure.
`RMSE.val`	as RMSE.cal but for the test set, if provided.
`ERRrate.val and rocAUC.val`	as for calibration set but for the test set, if provided.
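The rule for `Beta` (add the first A elements together) and the RMSE criterion can be illustrated in base R. The `Beta` list below is a made-up toy example for four predictors, not output from lm_CLV, and the helper `rmse()` is a hypothetical name introduced here.

```r
# Toy list of per-component beta vectors (hypothetical values)
Beta <- list(c(0.5, 0.5, 0, 0),
             c(0, 0, -0.3, -0.3),
             c(0.1, 0.1, 0, 0))

# Coefficients of the model with the first A latent components
A <- 2
beta_A <- Reduce(`+`, Beta[seq_len(A)])

# Apply them to pretreated (here: standardized) predictors
set.seed(7)
Xs <- scale(matrix(rnorm(20 * 4), 20, 4))
pred <- drop(Xs %*% beta_A)   # assumption: response centered, no intercept

# Root mean square error between observed and predicted values
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
obs <- pred + rnorm(20, sd = 0.1)   # simulated observations for illustration
rmse(obs, pred)
```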

See Also

ClustVarLV
