fda: fRegress – R documentation


fRegress

Functional Regression Analysis

Description

This function carries out a functional regression analysis, where either the dependent variable or one or more independent variables are functional. Non-functional variables may be used on either side of the equation. In a simple problem where there is a single scalar independent covariate with values z_i, i=1,…,N and a single functional covariate with values x_i(t), the two versions of the model fit by fRegress are the scalar dependent variable model

y_i = β_1 z_i + \int x_i(t) β_2(t) \, dt + e_i

and the concurrent functional dependent variable model

y_i(t) = β_1(t) z_i + β_2(t) x_i(t) + e_i(t).

In these models, the final term e_i or e_i(t) is a residual, lack of fit or error term.

In the concurrent functional linear model for a functional dependent variable, all functional variables are evaluated at a common time or argument value t. That is, the fit is defined in terms of the behavior of all variables at a fixed time, or in terms of "now" behavior.

All regression coefficient functions β_j(t) are considered to be functional. In the case of a scalar dependent variable, the regression coefficient for a scalar covariate is converted to a functional variable with a constant basis. All regression coefficient functions can be forced to be smooth through the use of roughness penalties, and consequently are specified in the argument list as functional parameter objects.

Usage

fRegress(y, ...)
## S3 method for class 'fd'
fRegress(y, xfdlist, betalist, wt=NULL,
                     y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, 
                        method=c('fRegress', 'model'), sep='.', ...)
## S3 method for class 'double'
fRegress(y, xfdlist, betalist, wt=NULL,
                     y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)
## S3 method for class 'formula'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
                 y2cMap=NULL, SigmaE=NULL,
                 method=c('fRegress', 'model'), sep='.', ...)
## S3 method for class 'character'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
                 y2cMap=NULL, SigmaE=NULL,
                 method=c('fRegress', 'model'), sep='.', ...)

Arguments

`y`	the dependent variable object. It may be an object of five possible classes or attributes:
character or formula: a `formula` object or a `character` object that can be coerced into a `formula`, providing a symbolic description of the model to be fitted and satisfying the following rules: The left hand side, `formula` `y`, must be either a numeric vector or a univariate object of class `fd`. All objects named on the right hand side must be either `numeric` or `fd` (functional data). The number of replications of `fd` object(s) must match each other and the number of observations of `numeric` objects named, as well as the number of replications of the dependent variable object. The right hand side of this `formula` is translated into `xfdlist`, then passed to another method for fitting (unless `method` = 'model'). Multivariate independent variables are allowed in a `formula` and are split into univariate independent variables in the resulting `xfdlist`. Similarly, categorical independent variables with `k` levels are translated into `k-1` contrasts in `xfdlist`. Any smoothing information is passed to the corresponding component of `betalist`.
numeric: a numeric vector object or a matrix object if the dependent variable is numeric or a matrix.
fd: a functional data object or an fdPar object if the dependent variable is functional.
`data`	an optional `list` or `data.frame` containing names of objects identified in the `formula` or `character` `y`.
`xfdlist`	a list of length equal to the number of independent variables (including any intercept). Members of this list are the independent variables. They can be objects of either of these two classes:
scalar: a numeric vector if the independent variable is scalar.
fd: a (univariate) functional data object.
In either case, the object must have the same number of replications as the dependent variable object. That is, if it is a scalar, it must be of the same length as the dependent variable, and if it is functional, it must have the same number of replications as the dependent variable. (Only univariate independent variables are currently allowed in `xfdlist`.)
`betalist`	For the `fd`, `fdPar`, and `numeric` methods, `betalist` must be a list of length equal to `length(xfdlist)`. Members of this list are functional parameter objects (class `fdPar`) defining the regression functions to be estimated. Even if a corresponding independent variable is scalar, its regression coefficient must be functional if the dependent variable is functional. (If the dependent variable is a scalar, the coefficients of scalar independent variables, including the intercept, must be constants, but the coefficients of functional independent variables must be functional.) Each of these functional parameter objects defines a single functional data object, that is, with only one replication. For the `formula` and `character` methods, `betalist` can be either a `list`, as for the other methods, or `NULL`, in which case a list is created. If `betalist` is created, it will use the bases from the corresponding component of `xfdlist` if it is functional, or otherwise from the response variable. Smoothing information (arguments `Lfdobj`, `lambda`, `estimate`, and `penmat` of function `fdPar`) will come from the corresponding component of `xfdlist` if it is of class `fdPar` (or for scalar independent variables from the response variable if it is of class `fdPar`) or from optional `...` arguments if the reference variable is not of class `fdPar`.
`wt`	weights for weighted least squares
`y2cMap`	the matrix mapping from the vector of observed values to the coefficients for the dependent variable. This is output by function `smooth.basis`. If this is supplied, confidence limits are computed, otherwise not.
`SigmaE`	Estimate of the covariances among the residuals. This can only be estimated after a preliminary analysis with `fRegress`.
`method`	a character string matching either `fRegress` for functional regression estimation or `model` to create the argument lists for functional regression estimation without running it.
`sep`	separator for creating names for multiple variables for `fRegress.fdPar` or `fRegress.numeric` created from single variables on the right hand side of the `formula` `y`. This happens with multidimensional `fd` objects as well as with categorical variables.
`returnMatrix`	logical: If TRUE, a two-dimensional matrix is returned using a special class from the Matrix package.
`...`	optional arguments

Details

Alternative forms of functional regression can be categorized with traditional least squares using the following 2 x 2 table:

                        explanatory variable
response  |  scalar                         |  function
----------+---------------------------------+------------------------------------------
scalar    |  lm                             |  fRegress.numeric
function  |  fRegress.fd or fRegress.fdPar  |  fRegress.fd or fRegress.fdPar or linmod

For fRegress.numeric, the numeric response is assumed to be the sum of integrals of xfd * beta for all functional xfd terms.
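
The scalar-response model can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes simulated curves observed on a regular grid, approximates the integral with the trapezoidal rule, expands the coefficient function in a small hand-built basis, and fits by ordinary least squares without any roughness penalty. All names (tgrid, Phi, Z, beta_true) are illustrative.

```r
set.seed(1)
N     <- 60                            # number of observations
tgrid <- seq(0, 1, length.out = 101)   # common evaluation grid
dt    <- tgrid[2] - tgrid[1]

# Simulated functional covariate x_i(t): one curve per row
X <- t(replicate(N, rnorm(1) * sin(2 * pi * tgrid) +
                    rnorm(1) * cos(2 * pi * tgrid) + rnorm(1)))

# Trapezoidal quadrature weights for the integral over t
w <- rep(dt, length(tgrid))
w[c(1, length(tgrid))] <- dt / 2

# Simulated response: y_i = 1 + int x_i(t) beta(t) dt + e_i
beta_true <- 2 * sin(2 * pi * tgrid)
y <- 1 + drop(X %*% (w * beta_true)) + rnorm(N, sd = 0.05)

# Basis for beta(t): constant, sine, cosine (columns of Phi)
Phi <- cbind(1, sin(2 * pi * tgrid), cos(2 * pi * tgrid))

# Design matrix: intercept plus the integrals int x_i(t) phi_k(t) dt
Z   <- cbind(1, X %*% (w * Phi))
fit <- lm.fit(Z, y)

# Reconstruct the estimated coefficient function on the grid
beta_hat <- drop(Phi %*% fit$coefficients[-1])
```

Once beta(t) is written in a basis, the integral term becomes a linear combination of ordinary covariates, which is why the fit reduces to a least-squares problem.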

fRegress.fd or .fdPar produces a concurrent regression with each beta being also a (univariate) function.
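
A rough way to see what "concurrent" means is to fit a separate least-squares regression at each argument value. The base-R sketch below does this on simulated data. It is an unsmoothed analogue only: fRegress instead represents each beta in a basis and can apply roughness penalties. The grid, sample size and true coefficient functions are assumptions made for illustration.

```r
set.seed(2)
N     <- 40
tgrid <- seq(0, 1, length.out = 25)
nt    <- length(tgrid)

z <- rnorm(N)                          # scalar covariate z_i
X <- matrix(rnorm(N * nt), N, nt)      # functional covariate x_i(t_j)

beta1 <- 1 + tgrid                     # true beta_1(t)
beta2 <- sin(2 * pi * tgrid)           # true beta_2(t)

# y_i(t) = beta_1(t) z_i + beta_2(t) x_i(t) + e_i(t), evaluated on the grid
Y <- outer(z, beta1) + sweep(X, 2, beta2, `*`) +
     matrix(rnorm(N * nt, sd = 0.1), N, nt)

# Fit the model separately at each t_j; only values at the same t_j are used
B <- sapply(seq_len(nt), function(j) {
  unname(coef(lm(Y[, j] ~ 0 + z + X[, j])))
})
beta1_hat <- B[1, ]
beta2_hat <- B[2, ]
```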

linmod predicts a functional response from a convolution integral, estimating a bivariate regression function.

In the computation of regression function estimates in fRegress, all independent variables are treated as if they are functional. If argument xfdlist contains one or more vectors, these are converted to functional data objects having the constant basis with coefficients equal to the elements of the vector.
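
The conversion of a scalar covariate to a functional data object on the constant basis can be reproduced by hand, as the following sketch suggests. It assumes the fda package is installed; the vector z and the range c(0, 365) are made up for illustration.

```r
library(fda)

z <- c(2.5, -1, 0.3, 4)                      # hypothetical scalar covariate
constbasis <- create.constant.basis(c(0, 365))

# One basis function, one replication per observation:
# the coefficient matrix is 1 x length(z)
zfd <- fd(matrix(z, nrow = 1), constbasis)

# Each replication evaluates to its scalar value at every argument value
vals <- eval.fd(c(0, 100, 365), zfd)
```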

Needless to say, if all the variables in the model are scalar, do NOT use this function. Instead, use either lm or lsfit.

These functions provide a partial implementation of Ramsay and Silverman (2005, chapters 12-20).

Value

These functions return either a standard fRegress fit object or a model specification:

The fRegress fit object case:

A list of class fRegress with the following components:

y: The first argument in the call to fRegress. This argument is coerced to class fd in fda version 5.1.9. Prior versions of the package converted it to an fdPar, but the extra structures in that class were not used in any of the fRegress codes.
xfdlist: The second argument in the call to fRegress.
betalist: The third argument in the call to fRegress.
betaestlist: A list of length equal to the number of independent variables and with members having the same functional parameter structure as the corresponding members of betalist. These are the estimated regression coefficient functions.
yhatfdobj: A functional parameter object (class fdPar) if the dependent variable is functional or a vector if the dependent variable is scalar. This is the set of values predicted by the functional regression model for the dependent variable.
Cmatinv: A matrix containing the inverse of the coefficient matrix for the linear equations that define the solution to the regression problem. This matrix is required for function fRegress.stderr that estimates confidence regions for the regression coefficient function estimates.
wt: The vector of weights input or inferred.

If class(y) is numeric, the fRegress object also includes:
df: The equivalent degrees of freedom for the fit.
OCV: The leave-one-out cross validation score for the model.
gcv: The generalized cross validation score.
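
For orientation, the base-R sketch below computes these three quantities for a generic penalized least-squares fit with hat matrix H, using textbook definitions. The exact scaling used inside fRegress may differ, and the design matrix, penalty matrix R and lambda here are illustrative assumptions.

```r
set.seed(3)
n <- 30
X <- cbind(1, rnorm(n), rnorm(n))
y <- drop(X %*% c(1, 2, -1)) + rnorm(n)

lambda <- 0.5
R <- diag(c(0, 1, 1))                  # penalize the two slopes only

# Hat matrix of the penalized fit: yhat = H y
H    <- X %*% solve(crossprod(X) + lambda * R, t(X))
yhat <- drop(H %*% y)

# Equivalent degrees of freedom: trace of the hat matrix
df <- sum(diag(H))

# Leave-one-out CV via the shortcut (no refitting needed)
ocv_shortcut <- mean(((y - yhat) / (1 - diag(H)))^2)

# Same score by explicitly refitting without each observation
loo_resid <- sapply(seq_len(n), function(i) {
  b <- solve(crossprod(X[-i, ]) + lambda * R, crossprod(X[-i, ], y[-i]))
  y[i] - sum(X[i, ] * b)
})
ocv_explicit <- mean(loo_resid^2)

# One common form of generalized cross validation
gcv <- n * sum((y - yhat)^2) / (n - df)^2
```

The shortcut and the explicit refit agree because the penalty matrix and lambda are held fixed when an observation is left out.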

If class(y) is fd or fdPar, the fRegress object returned also includes 5 other components:
y2cMap: An input y2cMap.
SigmaE: An input SigmaE.
betastderrlist: An fd object estimating the standard errors of betaestlist.
bvar: A covariance matrix for regression coefficient estimates.
c2bMap: A mapping matrix that maps variation in Cmat to variation in regression coefficients.

The model specification object case:

The fRegress.formula and fRegress.character functions translate the formula into the argument list required by fRegress.fdPar or fRegress.numeric. With the default value 'fRegress' for the argument method, this list is then used to call the appropriate other fRegress function.

Alternatively, to see how the formula is translated, use the alternative 'model' value for the argument method. In that case, the function returns a list with the arguments otherwise passed to these other functions plus the following additional components:

xfdlist0: A list of the objects named on the right hand side of formula. This will differ from xfdlist for any categorical or multivariate right hand side object.
type: the type component of any fd object on the right hand side of formula.
nbasis: A vector containing the nbasis components of variables named in formula having such components.
xVars: A named integer vector with one element for each variable named on the right hand side of formula, giving the corresponding number of variables in xfdlist. This can exceed 1 for any multivariate object on the right hand side of class either numeric or fd as well as any categorical variable.

Author(s)

J. O. Ramsay, Giles Hooker, and Spencer Graves

References

Ramsay, James O., Hooker, Giles, and Graves, Spencer (2009) Functional Data Analysis in R and Matlab, Springer, New York.

Ramsay, James O., and Silverman, Bernard W. (2005), Functional Data Analysis, 2nd ed., Springer, New York.

Examples

###
###
###  ---------  vector response with functional explanatory variable  ---------
###
###

#  data are in Canadian Weather object
#  print the names of the data
print(names(CanadianWeather))
#  set up log10 of annual precipitation for 35 weather stations
annualprec <- log10(apply(CanadianWeather$dailyAv[,,"Precipitation.mm"], 2,sum))
# The simplest 'fRegress' call is singular with more bases
# than observations, so we use only 25 basis functions, for this example
smallbasis  <- create.fourier.basis(c(0, 365), 25)
# The covariate is the temperature curve for each station.
tempfd <- smooth.basis(day.5,
          CanadianWeather$dailyAv[,,"Temperature.C"], smallbasis)$fd

##
## formula interface:  specify the model by a formula, the method
## fRegress.formula automatically sets up the regression coefficient functions,
## a constant function for the intercept, and a higher dimensional function
## for the inner product with temperature
##

precip.Temp1 <- fRegress(annualprec ~ tempfd)
#  the output is a list with class name fRegress, display names
names(precip.Temp1)
#  the vector of fits to the data is object  precip.Temp1$yhatfdobj;
#  since the dependent variable is a vector, so is the fit
annualprec.fit1 <- precip.Temp1$yhatfdobj
#  plot the data and the fit
plot(annualprec.fit1, annualprec, type="p", pch="o")
lines(annualprec.fit1, annualprec.fit1, lty=2)
#  print root mean squared error
RMSE <- sqrt(mean((annualprec-annualprec.fit1)^2))
print(paste("RMSE =",RMSE))
#  plot the estimated regression function
plot(precip.Temp1$betaestlist[[2]])
#  This plot is not very helpful: the coefficient function is too
#  complicated to interpret.
#  display the number of basis functions used:
print(precip.Temp1$betaestlist[[2]]$fd$basis$nbasis)
#  25 basis functions to fit 35 values, no wonder we over-fit the data

##
## Get the default setup and modify it
## the "model" value of the method argument causes the analysis
## to produce a list vector of arguments for calling the
## fRegress function
##

precip.Temp.mdl1 <- fRegress(annualprec ~ tempfd, method="model")
# First confirm we get the same answer as above by calling
# function fRegress() with these arguments:
precip.Temp.m <- do.call('fRegress', precip.Temp.mdl1)

all.equal(precip.Temp.m, precip.Temp1)


#  set up a smaller basis for beta2 than for temperature so that we
#  get a more parsimonious fit to the data

nbetabasis2 <- 21  #  not much less, but we add some roughness penalization
betabasis2  <- create.fourier.basis(c(0, 365), nbetabasis2)
betafd2     <- fd(rep(0, nbetabasis2), betabasis2)
# add smoothing
betafdPar2  <- fdPar(betafd2, lambda=10)

# replace the regress coefficient function with this fdPar object
precip.Temp.mdl2 <- precip.Temp.mdl1
precip.Temp.mdl2[['betalist']][['tempfd']] <- betafdPar2

# Now re-fit the data
precip.Temp2 <- do.call('fRegress', precip.Temp.mdl2)

# Compare the two fits:
#  degrees of freedom
precip.Temp1[['df']] # 26
precip.Temp2[['df']]  # 8
#  root-mean-squared errors:
RMSE1 <- sqrt(mean(with(precip.Temp1, (yhatfdobj-yfdobj)^2)))
RMSE2 <- sqrt(mean(with(precip.Temp2, (yhatfdobj-yfdobj)^2)))
#  display further results for the more parsimonious model
annualprec.fit2 <- precip.Temp2$yhatfdobj
plot(annualprec.fit2, annualprec, type="p", pch="o")
lines(annualprec.fit2, annualprec.fit2, lty=2)
#  plot the estimated regression function
plot(precip.Temp2$betaestlist[[2]])
#  now we see that it is primarily the temperatures in the
#  early winter that provide the fit to log precipitation by temperature

##
## Manual construction of xfdlist and betalist
##

xfdlist <- list(const=rep(1, 35), tempfd=tempfd)

# The intercept must be constant for a scalar response
betabasis1 <- create.constant.basis(c(0, 365))
betafd1    <- fd(0, betabasis1)
betafdPar1 <- fdPar(betafd1)

betafd2     <- create.bspline.basis(c(0, 365),7)
# convert to an fdPar object
betafdPar2  <- fdPar(betafd2)

betalist <- list(const=betafdPar1, tempfd=betafdPar2)

precip.Temp3   <- fRegress(annualprec, xfdlist, betalist)
annualprec.fit3 <- precip.Temp3$yhatfdobj
#  plot the data and the fit
plot(annualprec.fit3, annualprec, type="p", pch="o")
lines(annualprec.fit3, annualprec.fit3)
plot(precip.Temp3$betaestlist[[2]])

###
###
### --------  functional response with vector explanatory variables  ----------
###
###

##
## simplest:  formula interface
##

daybasis65 <- create.fourier.basis(rangeval=c(0, 365), nbasis=65,
                  axes=list('axesIntervals'))
Temp.fd <- with(CanadianWeather, smooth.basisPar(day.5,
                dailyAv[,,'Temperature.C'], daybasis65)$fd)
TempRgn.f <- fRegress(Temp.fd ~ region, CanadianWeather)

##
## Get the default setup and possibly modify it
##

TempRgn.mdl <- fRegress(Temp.fd ~ region, CanadianWeather, method='m')

# make desired modifications here
# then run

TempRgn.m <- do.call('fRegress', TempRgn.mdl)

# no change, so match the first run

all.equal(TempRgn.m, TempRgn.f)


##
## More detailed set up
##

region.contrasts <- model.matrix(~factor(CanadianWeather$region))
rgnContr3 <- region.contrasts
dim(rgnContr3) <- c(1, 35, 4)
dimnames(rgnContr3) <- list('', CanadianWeather$place, c('const',
   paste('region', c('Atlantic', 'Continental', 'Pacific'), sep='.')) )

const365 <- create.constant.basis(c(0, 365))
region.fd.Atlantic <- fd(matrix(rgnContr3[,,2], 1), const365)
# str(region.fd.Atlantic)
region.fd.Continental <- fd(matrix(rgnContr3[,,3], 1), const365)
region.fd.Pacific <- fd(matrix(rgnContr3[,,4], 1), const365)
region.fdlist <- list(const=rep(1, 35),
     region.Atlantic=region.fd.Atlantic,
     region.Continental=region.fd.Continental,
     region.Pacific=region.fd.Pacific)
# str(TempRgn.mdl$betalist)

beta1 <- with(Temp.fd, fd(basisobj=basis, fdnames=fdnames))
beta0 <- fdPar(beta1)
betalist <- list(const=beta0, region.Atlantic=beta0,
             region.Continental=beta0, region.Pacific=beta0)

TempRgn <- fRegress(Temp.fd, region.fdlist, betalist)


all.equal(TempRgn, TempRgn.f)


###
###
### --------  functional response with functional explanatory variable  -------
###
###

##
##  predict knee angle from hip angle;  from demo('gait', package='fda')

##
## formula interface
##
gaittime  <- as.matrix((1:20)/21)
gaitrange <- c(0,20)
gaitbasis <- create.fourier.basis(gaitrange, nbasis=21)
harmaccelLfd <- vec2Lfd(c(0, (2*pi/20)^2, 0), rangeval=gaitrange)
gaitfd <- smooth.basisPar(gaittime, gait, gaitbasis, 
                          Lfdobj=harmaccelLfd, lambda=1e-2)$fd
hipfd  <- gaitfd[,1]
kneefd <- gaitfd[,2]

knee.hip.f <- fRegress(kneefd ~ hipfd)

##
## manual set-up
##

#  set up the list of covariate objects
const  <- rep(1, dim(kneefd$coef)[2])
xfdlist  <- list(const=const, hipfd=hipfd)

beta0 <- with(kneefd, fd(basisobj=basis, fdnames=fdnames))
beta1 <- with(hipfd, fd(basisobj=basis, fdnames=fdnames))

betalist  <- list(const=fdPar(beta0), hipfd=fdPar(beta1))

fRegressout <- fRegress(kneefd, xfdlist, betalist)


all.equal(fRegressout, knee.hip.f)

fda

Functional Data Analysis

v5.1.9

GPL (>= 2)

Authors

J. O. Ramsay <ramsay@psych.mcgill.ca> [aut,cre], Spencer Graves <spencer.graves@effectivedefense.org> [ctb], Giles Hooker <gjh27@cornell.edu> [ctb]

Initial release

2020-12-16