Generate segments for cross-validation
The function generates a list of segments for cross-validation. It can generate random, consecutive and interleaved segments, and supports keeping replicates in the same segment.
cvsegments(N, k, length.seg = ceiling(N / k), nrep = 1, type = c("random", "consecutive", "interleaved"))
N | Integer. The number of rows in the data set.
k | Integer. The number of segments to return.
length.seg | Integer. The length of the segments. If given, it overrides k.
nrep | Integer. The number of (consecutive) rows that are replicates of the same object. Replicates will always be kept in the same segment.
type | One of "random", "consecutive" or "interleaved". The type of segments to generate.
If length.seg is specified, it is used to calculate the number of segments to generate. Otherwise k must be specified. If k*length.seg is not equal to N, the last k*length.seg - N segments will contain only length.seg - 1 indices.
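The segment arithmetic described above can be checked by hand. The following base R sketch does not call cvsegments; it only reproduces the stated formulas for N = 50 and length.seg = 3. The variable names n_incomplete and seg_lengths are illustrative and not part of the package.

```r
## Hand calculation of the segment layout; not the package's implementation.
N <- 50
length.seg <- 3

## A specified length.seg determines the number of segments.
k <- ceiling(N / length.seg)          # 17

## If k * length.seg differs from N, this many segments are one index short.
n_incomplete <- k * length.seg - N    # 1

## Expected segment lengths: complete segments first, incomplete ones last.
seg_lengths <- c(rep(length.seg, k - n_incomplete),
                 rep(length.seg - 1, n_incomplete))
sum(seg_lengths) == N                 # TRUE
```

Here 16 segments of length 3 and one segment of length 2 account for all 50 rows.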
If type is "random", the indices are allocated to segments in random order. If it is "consecutive", the first segment will contain the first length.seg indices, and so on. If type is "interleaved", the first segment will contain the indices 1, length.seg+1, 2*length.seg+1, …, (k-1)*length.seg+1, and so on.
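To make the three allocation schemes concrete, here is a small base R illustration for N = 9 and k = 3, chosen so that k divides N and k equals length.seg. It is a hand-rolled sketch under those assumptions and does not call cvsegments; the object names consecutive, random and interleaved_first are made up for the example.

```r
## Hand-rolled illustration for N = 9, k = 3; not the package's implementation.
N <- 9
k <- 3
length.seg <- ceiling(N / k)   # 3

## "consecutive": segment 1 holds the first length.seg indices, and so on.
consecutive <- split(seq_len(N), rep(seq_len(k), each = length.seg))

## "random": the same layout, but over a random permutation of the indices.
random <- split(sample(N), rep(seq_len(k), each = length.seg))

## "interleaved": first segment as given by the formula in the text.
interleaved_first <- seq(1, by = length.seg, length.out = k)

consecutive[[1]]     # 1 2 3
interleaved_first    # 1 4 7
```

The random segments differ from run to run, so their contents are not shown; each still has length.seg indices and together they cover 1 to N.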
If nrep > 1, it is assumed that each nrep consecutive rows are replicates (repeated measurements) of the same object, and care is taken that replicates are never put in different segments.
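One way to picture the handling of replicates is to segment the objects first and then expand each object into its rows. The base R sketch below does this for N = 12 rows, nrep = 2 and k = 3; it is an assumption about how such a scheme could work, not the package's code, and rows_of, obj_segs and row_segs are hypothetical names.

```r
## Sketch: allocate objects to segments, then expand objects to rows.
## Not the package's implementation.
N <- 12
nrep <- 2
k <- 3
n_obj <- N / nrep                      # 6 distinct objects

## Rows belonging to object o are (o - 1) * nrep + 1, ..., o * nrep.
rows_of <- function(obj) as.vector(outer(seq_len(nrep), (obj - 1) * nrep, "+"))

## Randomly allocate the objects to k equally sized segments.
obj_segs <- split(sample(n_obj), rep(seq_len(k), each = n_obj / k))

## Expand every segment from object indices to row indices.
row_segs <- lapply(obj_segs, rows_of)

rows_of(c(1, 2))   # 1 2 3 4
```

Because whole objects are allocated, rows 1 and 2 always end up in the same segment, as do rows 3 and 4, and so on.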
Warning: If k does not divide N, a specified length.seg does not divide N, or nrep does not divide length.seg, the number of segments and/or the segment length will be adjusted as needed. Warnings are printed for some of these cases, and one should always inspect the resulting segments to make sure they are as expected.
A list of vectors. Each vector contains the indices for one segment. The attribute "incomplete" contains the number of incomplete segments, and the attribute "type" contains the type of segments.
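The shape of the return value can be mimicked with a hand-built list, which shows how the two attributes are read. The object below is a made-up stand-in for N = 5, k = 3 and consecutive segments, not actual output from cvsegments.

```r
## Hand-built stand-in for a return value; the indices are made up.
segs <- structure(
    list(c(1, 2), c(3, 4), 5),
    incomplete = 1,
    type = "consecutive"
)

attr(segs, "incomplete")   # 1
attr(segs, "type")         # "consecutive"
lengths(segs)              # 2 2 1
```

Checking lengths() alongside the "incomplete" attribute is a quick way to inspect the segments, as recommended in the warning above.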
Bjørn-Helge Mevik and Ron Wehrens
## Segments for 10-fold randomised cross-validation:
cvsegments(100, 10)

## Segments with four objects, taken consecutive:
cvsegments(60, length.seg = 4, type = "cons")

## Incomplete segments
segs <- cvsegments(50, length.seg = 3)
attr(segs, "incomplete")

## Leave-one-out cross-validation:
cvsegments(100, 100)

## Leave-one-out with variable/unknown data set size n:
n <- 50
cvsegments(n, length.seg = 1)

## Data set with replicates
cvsegments(100, 25, nrep = 2)
## Note that rows 1 and 2 are in the same segment, rows 3 and 4 in the
## same segment, and so on.