
CVHTF

K-fold Cross-Validation


Description

K-fold cross-validation.

Usage

CVHTF(X, y, K = 10, REP = 1, family = gaussian, ...)

Arguments

X

training inputs

y

training output

K

number of folds, i.e. the number of non-overlapping subsets into which the observations are partitioned

REP

number of replications

family

glm family

...

optional arguments passed to glm or lm

Details

Hastie, Tibshirani and Friedman (2009), hereafter HTF, describe K-fold cross-validation. The observations are partitioned into K non-overlapping subsets of approximately equal size. Each subset is used in turn as the validation sample while the remaining K-1 subsets are used as training data. When K=n, where n is the number of observations, the algorithm is equivalent to leave-one-out CV. Normally K=10 or K=5 is used. When K<n, there may be many possible partitions, and so the results of K-fold CV may vary somewhat depending on the partition used. In our implementation, random partitions are used and we allow for many replications. Note that in Shao's delete-d method, random samples are used to select the validation data, whereas in this method the whole partition is selected at random. This is accomplished using fold <- sample(rep(1:K,length=n)). Then fold[i] gives the index of the validation sample to which observation i belongs.
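The partitioning scheme described above can be illustrated with a minimal sketch. This is not the package's source code: kfold_cv_sketch is a hypothetical helper, it fits only an ordinary lm (no family argument, no replications), the data are simulated, and it reports the plain standard deviation of the per-fold MSEs, which is only one possible reading of "sd" in the Value section.

```r
# Hypothetical helper illustrating K-fold CV with a random partition.
# Not part of bestglm; assumes a Gaussian linear model fitted with lm().
kfold_cv_sketch <- function(X, y, K = 10) {
  dat <- data.frame(X, y = y)  # assumes X has no column already named "y"
  n <- length(y)
  # Random partition: every observation gets a fold label in 1..K,
  # and fold sizes differ by at most one.
  fold <- sample(rep(1:K, length.out = n))
  mse <- numeric(K)
  for (k in 1:K) {
    test <- fold == k
    # Train on the other K-1 folds, validate on fold k
    fit <- lm(y ~ ., data = dat[!test, , drop = FALSE])
    pred <- predict(fit, newdata = dat[test, , drop = FALSE])
    mse[k] <- mean((y[test] - pred)^2)
  }
  # Mean of the fold MSEs and their standard deviation (an assumption
  # about how the spread is summarised)
  c(CV = mean(mse), sd = sd(mse))
}

# Simulated data, purely for illustration
set.seed(1)
n <- 100
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- 1 + 2 * X$x1 - X$x2 + rnorm(n)
kfold_cv_sketch(X, y, K = 10)
```

Because the partition is random, repeated calls give somewhat different estimates unless the seed is fixed, which is the motivation for the REP argument.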

Value

Vector of two components comprising the cross-validation MSE and its sd based on the MSE in each validation sample.

Author(s)

A.I. McLeod and C. Xu

References

Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning. 2nd Ed. Springer-Verlag.

See Also

Examples

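# Example 1. 10-fold CV
library(bestglm)
data(zprostate)
# Keep the training rows (column 10 is the logical train indicator), then drop that column
train <- (zprostate[zprostate[,10],])[,-10]
X <- train[,1:2]   # first two inputs
y <- train[,9]     # response
# The first component of the result is the cross-validation MSE
CVHTF(X, y, K=10, REP=1)[1]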

bestglm

Best Subset GLM and Regression Utilities

v0.37.3
GPL (>= 2)
Authors
A.I. McLeod, Changjiang Xu and Yuanhao Lai
Initial release
2020-03-13
