
LMDC.select

Impact points selection of functional predictor and regression using local maxima distance correlation (LMDC)


Description

LMDC.select selects the impact points of a functional predictor using local maxima distance correlation (LMDC) for a given scalar response.
LMDC.regre fits a multivariate regression method using the selected impact points as covariates for a scalar response.
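To make the idea of "local maxima of distance correlation" concrete, the following base R sketch computes a distance correlation curve over the grid points of simulated curves and picks its local maxima. It is an illustration only, not the package's implementation: dcor_sketch and local_max are hypothetical helper names, the plain (not bias-corrected) sample statistic is used, no t-test is applied, and the simulated data are an assumption made for the example.

```r
# Sample distance correlation between two numeric vectors
# (plain V-statistic version, not the bias-corrected one).
dcor_sketch <- function(x, y) {
  center <- function(d) {
    # Double-centre: subtract row means and column means, add the grand mean
    sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  }
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  denom <- sqrt(mean(A * A) * mean(B * B))
  if (denom == 0) return(0)
  sqrt(max(mean(A * B), 0) / denom)
}

# Interior indices that are local maxima of v and exceed tol,
# ordered from the highest peak to the lowest (simplified rule).
local_max <- function(v, tol = 0.06) {
  i <- seq_along(v)[-c(1, length(v))]
  idx <- i[v[i] > v[i - 1] & v[i] >= v[i + 1] & v[i] > tol]
  idx[order(v[idx], decreasing = TRUE)]
}

# Simulated example: n random-walk "curves" observed at p grid points,
# with a response driven by the value at grid point 20.
set.seed(1)
n <- 100
p <- 50
X <- t(apply(matrix(rnorm(n * p), n, p), 1, cumsum))
y <- X[, 20] + rnorm(n, sd = 0.1)

dc <- apply(X, 2, dcor_sketch, y = y)   # distance correlation curve
peaks <- local_max(dc, tol = 0.06)      # candidate impact points
```

In this toy setting the highest peak of dc is expected to lie at or near grid point 20. LMDC.select additionally uses the pvalue argument (bias-corrected distance correlation t-test) and optional smoothing, which this sketch omits.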

Usage

LMDC.select(
  y,
  covar,
  data,
  tol = 0.06,
  pvalue = 0.05,
  plot = FALSE,
  local.dc = TRUE,
  smo = FALSE,
  verbose = FALSE
)

LMDC.regre(
  y,
  covar,
  data,
  newdata,
  pvalue = 0.05,
  method = "lm",
  par.method = NULL,
  plot = FALSE,
  verbose = FALSE
)

Arguments

y

name of the response variable.

covar

vector of length p with the names of the covariates (or points of impact).

data

data frame with n rows and at least p + 1 columns, containing the scalar response and the p potential covariates (or points of impact) in the model.

tol

Tolerance value for distance correlation and impact point.

pvalue

p-value of the bias-corrected distance correlation t-test.

plot

logical value. If TRUE, plots the distance correlation curve for each covariate in the multivariate case and at each discretization point (argvals) in the functional case.

local.dc

logical. If TRUE, compute local distance correlation.

smo

logical. If TRUE, the curve of distance correlation computed in the impact points is smoothed using B-spline representation with a suitable number of basis elements.

verbose

print iterative and relevant steps of the procedure.

newdata

An optional data frame in which to look for variables with which to predict.

method

Name of the regression method used, see Details. This argument is passed to do.call as the "what" argument.

par.method

List of parameters used to call the method. This argument is passed to do.call as the "args" argument.
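The way a method name and a parameter list can be forwarded through do.call can be sketched in base R as follows. The helper fit_with is hypothetical and only mimics the "what"/"args" pattern described for method and par.method. It is not the internal code of LMDC.regre, and the mtcars data set is used purely for illustration.

```r
# Hypothetical helper: forwards a method name and extra parameters
# to do.call(), in the spirit of the method / par.method arguments.
fit_with <- function(method, formula, data, par.method = NULL) {
  # Base arguments plus any user-supplied ones (NULL adds nothing)
  args <- c(list(formula = formula, data = data), par.method)
  do.call(what = method, args = args)
}

# Default call: equivalent to lm(mpg ~ wt + hp, data = mtcars)
fit <- fit_with("lm", mpg ~ wt + hp, mtcars)

# Extra parameters travel through par.method, here lm's subset argument
fit_sub <- fit_with("lm", mpg ~ wt + hp, mtcars,
                    par.method = list(subset = 1:20))
```

Because the method is given as a character string, switching to another fitting function only requires changing that string and supplying a matching parameter list.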

Details

The method argument is a character string with the name of the regression method to call. Available model options:

  • "lm": Step-wise lm regression model (uses lm function, stats package). Recommended for linear models, test linearity using flm.test function.

  • "gam": Step-wise gam regression model (uses gam function, mgcv package). Recommended for non-linear models.

Models that use the indicated function of the required package:

  • "svm": Support vector machine (svm function, e1071 package).

  • "knn": k-nearest neighbor regression (knn.reg function, FNN package).

  • "lars": Least Angle Regression using Lasso (lars function, lars package).

  • "glmnet": Lasso and Elastic-Net Regularized Generalized Linear Models (glmnet and cv.glmnet function, glmnet package).

  • "rpart": Recursive partitioning for regression (rpart function, rpart package).

  • "flam": Fit the Fused Lasso Additive Model for a Sequence of Tuning Parameters (flam function, flam package).

  • "novas": NOnparametric VAriable Selection (code available in https://www.math.univ-toulouse.fr/~ferraty/SOFTWARES/NOVAS/novas-routines.R).

  • "cosso": Fit Regularized Nonparametric Regression Models Using COSSO Penalty (cosso function, cosso package).

  • "npreg": kernel regression estimate of a one (1) dimensional dependent variable on p-variate explanatory data (npreg function, np package).

  • "mars": Multivariate adaptive regression splines (mars function, mda package).

  • "nnet": Fit Neural Networks (nnet function, nnet package).

  • "lars": Fits Least Angle Regression, Lasso and Infinitesimal Forward Stagewise regression models (lars function, lars package).

Value

LMDC.select returns a list of two elements:

  • cor: the value of distance correlation for each covariate.

  • maxLocal: index or locations of local maxima distance correlations.

LMDC.regre returns a list with the following elements:

  • model: object corresponding to the estimated method using the selected variables.

  • xvar: names of selected variables (impact points).

  • edf: effective degrees of freedom.

  • nvar: number of selected variables (impact points).

Author(s)

Manuel Oviedo de la Fuente manuel.oviedo@usc.es

References

Ordonez, C., Oviedo de la Fuente, M., Roca-Pardinas, J., Rodriguez-Perez, J. R. (2018). Determining optimum wavelengths for leaf water content estimation from reflectance: A distance correlation approach. Chemometrics and Intelligent Laboratory Systems, 173, 41-50. https://doi.org/10.1016/j.chemolab.2017.12.001

See Also

lm, gam, dcor.xy.

Examples

## Not run: 
data(tecator)
absorp=fdata.deriv(tecator$absorp.fdata,2)
ind=1:129
x=absorp[ind,]
y=tecator$y$Fat[ind]
newx=absorp[-ind,]
newy=tecator$y$Fat[-ind]

## Functional PC regression
res.pc=fregre.pc(x,y,1:6)
pred.pc=predict(res.pc,newx)

# Functional regression with basis representation
res.basis=fregre.basis.cv(x,y)
pred.basis=predict(res.basis[[1]],newx)

# Functional nonparametric regression
res.np=fregre.np.cv(x,y)
pred.np=predict(res.np,newx)

dat    <- data.frame("y"=y,x$data)
newdat <- data.frame("y"=newy,newx$data)

res.gam=fregre.gsam(y~s(x),data=list("df"=dat,"x"=x))
pred.gam=predict(res.gam,list("x"=newx))

dc.raw <- LMDC.select("y", data = dat, tol = 0.05, pvalue = 0.05,
                      plot = FALSE, smo = TRUE, verbose = FALSE)
covar <- paste("X", dc.raw$maxLocal, sep = "")
# Preselected design/impact points
covar
# Test for linearity between the functional predictor and the response
ftest <- flm.test(dat[, -1], dat[, "y"], B = 500, verbose = FALSE,
                  plot.it = FALSE, type.basis = "pc", est.method = "pc", p = 4, G = 50)

if (ftest$p.value > 0.05) {
  # Linear relationship, step-wise lm is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = 0.05,
                    method = "lm", plot = FALSE, verbose = FALSE)
} else {
  # Non-linear relationship, step-wise gam is recommended
  out <- LMDC.regre("y", covar, dat, newdat, pvalue = 0.05,
                    method = "gam", plot = FALSE, verbose = FALSE)
}
             
# Final  design/impact points
out$xvar

# Predictions
mean((newy-pred.pc)^2)                
mean((newy-pred.basis)^2) 
mean((newy-pred.np)^2)              
mean((newy-pred.gam)^2) 
mean((newy-out$pred)^2)

## End(Not run)

fda.usc

Functional Data Analysis and Utilities for Statistical Computing

v2.0.2
GPL-2
Authors
Manuel Febrero Bande [aut], Manuel Oviedo de la Fuente [aut, cre], Pedro Galeano [ctb], Alicia Nieto [ctb], Eduardo Garcia-Portugues [ctb]
Initial release
2020-02-17
