sm: hcv – R documentation


hcv

Cross-validatory choice of smoothing parameter

Description

This function uses the technique of cross-validation to select a smoothing parameter suitable for constructing a density estimate or nonparametric regression curve in one or two dimensions.

Usage

hcv(x, y = NA, hstart = NA, hend = NA, ...)

Arguments

`x`	a vector, or two-column matrix, of data. If `y` is missing, these are the observations used to construct a density estimate. If `y` is present, these are the covariate values for a nonparametric regression.
`y`	a vector of response values for nonparametric regression.
`hstart`	the smallest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.
`hend`	the largest value of the grid points to be used in an initial grid search for the value of the smoothing parameter.
`...`	other optional parameters are passed to the `sm.options` function, through a mechanism which limits their effect only to this call of the function. Those specifically relevant for this function are the following: `h.weights`, `ngrid`, `display`, `add`; see the documentation of `sm.options` for their description.

Details

See Sections 2.4 and 4.5 of the reference below.

The two-dimensional case uses a smoothing parameter derived from a single value, scaled by the standard deviation of each component.
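One plausible reading of this scaling can be sketched in base R. The common value `h0` and the simulated matrix `x` are hypothetical, and the exact form used inside the package is an assumption here rather than something this page states.

```r
# Sketch only: derive a two-component smoothing parameter from one value.
# h0 is a hypothetical common value; the package's internal rule may differ.
set.seed(1)
x <- cbind(rnorm(100, sd = 1), rnorm(100, sd = 5))  # two-column data matrix

h0 <- 0.4                    # single underlying value (hypothetical)
sd_x <- apply(x, 2, sd)      # standard deviation of each component
h <- h0 * sd_x               # one smoothing parameter per component
```

Under this reading the component with the larger spread receives a proportionally larger smoothing parameter, so a single search over `h0` covers both dimensions.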

This function does not employ a sophisticated algorithm and some adjustment of the search parameters may be required for different sets of data. An initial estimate of the value of h which minimises the cross-validatory criterion is located from a grid search using values which are equally spaced on a log scale between hstart and hend. A quadratic approximation is then used to refine this initial estimate.
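The search described above can be sketched in base R as follows. This is not the package's source code: the function names `cv_density` and `grid_then_quadratic` are hypothetical, the criterion shown is the usual least-squares cross-validation criterion for a Gaussian kernel density estimate, and fitting the parabola in log(h) is an assumption.

```r
# Least-squares cross-validation criterion for a Gaussian kernel density
# estimate: integral of fhat^2 minus twice the mean leave-one-out estimate.
cv_density <- function(h, x) {
  n <- length(x)
  d <- outer(x, x, "-")                             # pairwise differences
  int_f2 <- sum(dnorm(d, sd = sqrt(2) * h)) / n^2   # integral of fhat^2
  k <- dnorm(d, sd = h)
  diag(k) <- 0                                      # leave each point out
  loo <- rowSums(k) / (n - 1)                       # fhat_{-i}(x_i)
  int_f2 - 2 * mean(loo)
}

# Grid search on a log scale followed by a quadratic refinement.
grid_then_quadratic <- function(crit, hstart, hend, ngrid = 32) {
  hgrid <- exp(seq(log(hstart), log(hend), length.out = ngrid))
  cv <- vapply(hgrid, crit, numeric(1))
  i <- which.min(cv)
  if (i == 1 || i == ngrid) {
    warning("minimum at the end of the grid; consider changing hstart/hend")
    return(hgrid[i])
  }
  # Parabola through the three grid points around the minimum, in log(h)
  idx <- (i - 1):(i + 1)
  lh <- log(hgrid[idx])
  cf <- coef(lm(cv[idx] ~ lh + I(lh^2)))
  unname(exp(-cf[2] / (2 * cf[3])))                 # vertex of the parabola
}

set.seed(1)
x <- rnorm(50)
h_cv <- grid_then_quadratic(function(h) cv_density(h, x),
                            hstart = 0.05, hend = 2)
```

Because the refinement only interpolates between neighbouring grid points, the result can be no better than the grid that brackets it, which is why the choice of `hstart` and `hend` matters.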

Value

the value of the smoothing parameter which minimises the cross-validation criterion over the selected grid.

Side Effects

If the minimising value is located at the end of the grid of search positions, or if some values of the cross-validatory criterion cannot be evaluated, then a warning message is printed. In these circumstances altering the values of hstart and hend may improve performance.
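As an illustration of adjusting the search range, the call below supplies explicit grid limits. The particular values 0.05 and 2 are arbitrary choices for simulated standard normal data, not recommendations, and the example assumes the sm package is installed.

```r
# Sketch only: widen the search grid when a boundary warning appears.
# The limits 0.05 and 2 are illustrative values, not package defaults.
h_wide <- NA
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  # Explicit limits for the initial grid search; ngrid goes to sm.options
  h_wide <- hcv(x, hstart = 0.05, hend = 2, ngrid = 32)
}
```

If the warning persists after widening the range, plotting the criterion with `display="lines"` shows where on the grid the minimum is being found.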

Note

As of version 2.1 of the package, a similar effect can be obtained with the function h.select, via h.select(x, method="cv"). Users are encouraged to adopt this route, since hcv might not be directly accessible in future releases of the package. When the sample size is large, hcv uses the raw data while h.select(x, method="cv") uses binning. The latter is likely to produce a more stable choice for h.
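A minimal comparison of the two routes, assuming the sm package is installed, might look like this. The data are simulated for illustration. No output values are shown, and the two results are not guaranteed to agree exactly.

```r
# Sketch only: the older hcv call next to the recommended h.select route.
h_old <- NA
h_new <- NA
if (requireNamespace("sm", quietly = TRUE)) {
  library(sm)
  set.seed(1)
  x <- rnorm(50)
  h_old <- hcv(x)                       # direct cross-validation search
  h_new <- h.select(x, method = "cv")   # same idea via h.select
}
```

With a sample this small the two calls are working from the same raw data, so large differences would be surprising. The binning mentioned above only comes into play for large samples.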

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

Examples

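#  Density estimation

library(sm)
x <- rnorm(50)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values of h
h.cv <- hcv(x, display="lines", ngrid=32)
# Right panel: density estimate using the selected h
sm.density(x, h=h.cv)
par(mfrow=c(1,1))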

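#  Nonparametric regression

x <- seq(0, 1, length = 50)
# Responses: a sine curve plus normal noise with standard deviation 0.2
y <- rnorm(50, sin(2 * pi * x), 0.2)
par(mfrow=c(1,2))
# Left panel: cross-validation criterion over a grid of 32 values of h
h.cv <- hcv(x, y, display="lines", ngrid=32)
# Right panel: regression curve using the selected h
sm.regression(x, y, h=h.cv)
par(mfrow=c(1,1))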

sm

Smoothing Methods for Nonparametric Regression and Density Estimation

v2.2-5.6

GPL (>= 2)

Authors

Adrian Bowman and Adelchi Azzalini. Ported to R by B. D. Ripley <ripley@stats.ox.ac.uk> up to version 2.0, version 2.1 by Adrian Bowman and Adelchi Azzalini, version 2.2 by Adrian Bowman.

Initial release

2018-09-27
