Goodness-of-fit test for the Functional Linear Model with scalar response
The function flm.test
tests the composite null hypothesis of
a Functional Linear Model with scalar response (FLM),
H_0: Y=<X,β>+ε,
versus a general alternative. If β=β_0 is provided, then the simple hypothesis H_0: Y=<X,β_0>+ε is tested. The testing of the null hypothesis is done by a Projected Cramer-von Mises statistic (see Details).
flm.test( X.fdata, Y, beta0.fdata = NULL, B = 5000, est.method = "pls", p = NULL, type.basis = "bspline", verbose = TRUE, plot.it = TRUE, B.plot = 100, G = 200, ... )
X.fdata |
Functional covariate for the FLM. The object must be in the class
|
Y |
Scalar response for the FLM. Must be a vector with the same number of elements
as functions are in |
beta0.fdata |
Functional parameter for the simple null hypothesis, in the |
B |
Number of bootstrap replicates to calibrate the distribution of the test statistic.
|
est.method |
Estimation method for the unknown parameter β,
only used in the composite case. Mainly, there are two options: specify the number of basis
elements for the estimated β by
|
p |
Number of elements of the basis considered. If it is not given, an optimal |
type.basis |
Type of basis used to represent the functional process. Depending on the hypothesis it will have a different interpretation:
|
verbose |
Either to show or not information about computing progress. |
plot.it |
Either to show or not a graph of the observed trajectory,
and the bootstrap trajectories under the null composite hypothesis, of the
process R_n(.) (see Details). Note that if |
B.plot |
Number of bootstrap trajectories to show in the resulting plot of the test.
As the trajectories shown are the first |
G |
Number of projections used to compute the trajectories of the process R_n(.) by Monte Carlo. |
... |
Further arguments passed to |
The Functional Linear Model with scalar response (FLM), is defined as
Y=<X,β>+ε, for a functional process
X such that E[X(t)]=0, E[X(t)ε]=0
for all t and for a scalar variable Y such that E[Y]=0.
Then, the test assumes that Y
and X.fdata
are centred and will automatically
center them. So, bear in mind that when you apply the test for Y
and X.fdata
,
actually, you are applying it to Y-mean(Y)
and fdata.cen(X.fdata)$Xcen
.
The test statistic corresponds to the Cramer-von Mises norm of the Residual Marked
empirical Process based on Projections R_n(u,γ) defined in
Garcia-Portugues et al. (2014).
The expression of this process in a p-truncated basis of the space L^2[0,T]
leads to the p-multivariate process R_{n,p}(u,γ^{(p)}),
whose Cramer-von Mises norm is computed.
The choice of an appropriate p to represent the functional process X,
in case that is not provided, is done via the estimation of β for the composite
hypothesis. For the simple hypothesis, as no estimation of β is done, the choice
of p depends only on the functional process X. As the result of the test may
change for different p's, we recommend to use an automatic criterion to select p
instead of provide a fixed one.
The distribution of the test statistic is approximated by a wild bootstrap resampling on the
residuals, using the golden section bootstrap.
Finally, the graph shown if plot.it=TRUE
represents the observed trajectory, and the
bootstrap trajectories under the null, of the process RMPP integrated on the projections:
R_n(u) \approx \frac{1}{G} ∑_{g=1}^G R_n(u,γ_g),
where γ_g are simulated as Gaussians processes. This gives a graphical idea of how distant is the observed trajectory from the null hypothesis.
An object with class "htest"
whose underlying structure is a list containing
the following components:
statistic The value of the test statistic.
boot.statistics A vector of length B
with the values of the bootstrap test statistics.
p.value The p-value of the test.
method The method used.
B The number of bootstrap replicates used.
type.basis The type of basis used.
beta.est The estimated functional parameter β in the composite
hypothesis. For the simple hypothesis, the given beta0.fdata
.
p The number of basis elements passed or automatically chosen.
ord The optimal order for PC and PLS given by fregre.pc.cv
and fregre.pls.cv
. For other methods is setted to 1:p
.
data.name The character string "Y=<X,b>+e"
No NA's are allowed neither in the functional covariate nor in the scalar response.
Eduardo Garcia-Portugues. Please, report bugs and suggestions to edgarcia@est-econ.uc3m.es
Escanciano, J. C. (2006). A consistent diagnostic test for regression models using projections. Econometric Theory, 22, 1030-1051. http://dx.doi.org/10.1017/S0266466606060506
Garcia-Portugues, E., Gonzalez-Manteiga, W. and Febrero-Bande, M. (2014). A goodness–of–fit test for the functional linear model with scalar response. Journal of Computational and Graphical Statistics, 23(3), 761-778. http://dx.doi.org/10.1080/10618600.2013.812519
# Simulated example # X=rproc2fdata(n=100,t=seq(0,1,l=101),sigma="OU") beta0=fdata(mdata=cos(2*pi*seq(0,1,l=101))-(seq(0,1,l=101)-0.5)^2+ rnorm(101,sd=0.05),argvals=seq(0,1,l=101),rangeval=c(0,1)) Y=inprod.fdata(X,beta0)+rnorm(100,sd=0.1) dev.new(width=21,height=7) par(mfrow=c(1,3)) plot(X,main="X") plot(beta0,main="beta0") plot(density(Y),main="Density of Y",xlab="Y",ylab="Density") rug(Y) ## Not run: # Composite hypothesis: do not reject FLM pcvm.sim=flm.test(X,Y,B=50,B.plot=50,G=100,plot.it=TRUE) pcvm.sim flm.test(X,Y,B=5000) # Estimated beta dev.new() plot(pcvm.sim$beta.est) # Simple hypothesis: do not reject beta=beta0 flm.test(X,Y,beta0.fdata=beta0,B=50,B.plot=50,G=100) flm.test(X,Y,beta0.fdata=beta0,B=5000) # AEMET dataset # data(aemet) # Remove the 5\ dev.new() res.FM=depth.FM(aemet$temp,draw=TRUE) qu=quantile(res.FM$dep,prob=0.05) l=which(res.FM$dep<=qu) lines(aemet$temp[l],col=3) aemet$df$name[l] # Data without outliers wind.speed=apply(aemet$wind.speed$data,1,mean)[-l] temp=aemet$temp[-l] # Exploratory analysis: accept the FLM pcvm.aemet=flm.test(temp,wind.speed,est.method="pls",B=100,B.plot=50,G=100) pcvm.aemet # Estimated beta dev.new() plot(pcvm.aemet$beta.est,lwd=2,col=2) # B=5000 for more precision on calibration of the test: also accept the FLM flm.test(temp,wind.speed,est.method="pls",B=5000) # Simple hypothesis: rejection of beta0=0? Limiting p-value... dat=rep(0,length(temp$argvals)) flm.test(temp,wind.speed, beta0.fdata=fdata(mdata=dat,argvals=temp$argvals, rangeval=temp$rangeval),B=100) flm.test(temp,wind.speed, beta0.fdata=fdata(mdata=dat,argvals=temp$argvals, rangeval=temp$rangeval),B=5000) # Tecator dataset # data(tecator) names(tecator) absorp=tecator$absorp.fdata ind=1:129 # or ind=1:215 x=absorp[ind,] y=tecator$y$Fat[ind] tt=absorp[["argvals"]] # Exploratory analysis for composite hypothesis with automatic choose of p pcvm.tecat=flm.test(x,y,B=100,B.plot=50,G=100) pcvm.tecat # B=5000 for more precision on calibration of the test: also reject the FLM flm.test(x,y,B=5000) # Distribution of the PCvM statistic plot(density(pcvm.tecat$boot.statistics),lwd=2,xlim=c(0,10), main="PCvM distribution", xlab="PCvM*",ylab="Density") rug(pcvm.tecat$boot.statistics) abline(v=pcvm.tecat$statistic,col=2,lwd=2) legend("top",legend=c("PCvM observed"),lwd=2,col=2) # Simple hypothesis: fixed p dat=rep(0,length(x$argvals)) flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals, rangeval=x$rangeval),B=100,p=11) # Simple hypothesis, automatic choose of p flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals, rangeval=x$rangeval),B=100) flm.test(x,y,beta0.fdata=fdata(mdata=dat,argvals=x$argvals, rangeval=x$rangeval),B=5000) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.