
ffdfindexget

Reading and writing ffdf data.frame using ff subscripts


Description

Function ffdfindexget extracts rows from an ffdf data.frame according to positive integer subscripts stored in an ff vector.
Function ffdfindexset performs the inverse operation: it assigns to rows of an ffdf data.frame according to positive integer subscripts stored in an ff vector. These functions allow more control than the method dispatch of [ and [<- when an ff integer subscript is used.

Usage

ffdfindexget(x, index, indexorder = NULL, autoindexorder = 3, FF_RETURN = NULL
  , BATCHSIZE = NULL, BATCHBYTES = getOption("ffmaxbytes"), VERBOSE = FALSE)
ffdfindexset(x, index, value, indexorder = NULL, autoindexorder = 3
  , BATCHSIZE = NULL, BATCHBYTES = getOption("ffmaxbytes"), VERBOSE = FALSE)

Arguments

x

An ffdf data.frame containing the elements

index

An ff integer vector with positive integer row subscripts in the range from 1 to nrow(x).

value

An ffdf data.frame like x with the rows to be assigned

indexorder

Optionally the return value of ffindexorder, see details

autoindexorder

The minimum number of columns needing chunked index ordering at which we switch from on-the-fly ordering to a stored ffindexorder, see details

FF_RETURN

Optionally an ffdf data.frame of the same type as x in which the returned values shall be stored, see details.

BATCHSIZE

Optional limit for the batch size (see details)

BATCHBYTES

Limit for the number of bytes per batch

VERBOSE

Logical scalar controlling verbose output

Details

Accessing rows of an ffdf data.frame identified by integer positions in an ff vector is a non-trivial task, because it could easily lead to random access to disk files. We avoid random access by loading batches of the subscript values into RAM, ordering them ascending, and only then accessing the ff values on disk. Such ordering is done on-the-fly for up to autoindexorder-1 columns that need ordering. For autoindexorder or more columns we do the batched ordering upfront with ffindexorder and then re-use it in each call to ffindexget or ffindexset, respectively.
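
The batching idea described above can be illustrated with a minimal base R sketch. It is not the ff implementation: a plain R vector stands in for the on-disk ff data, and the helper name batched_get and its batchsize argument are hypothetical, chosen only for this illustration. Each batch of subscripts is sorted ascending before the values are read, and the results are then put back into the originally requested order.

```r
# Illustrative sketch only: 'values' is an ordinary in-memory vector that
# stands in for an on-disk ff column; batched_get() is a hypothetical helper,
# not a function of the ff package.
batched_get <- function(values, index, batchsize) {
  out <- vector(mode = typeof(values), length = length(index))
  if (length(index) == 0L) return(out)
  starts <- seq(1L, length(index), by = batchsize)
  for (from in starts) {
    to <- min(from + batchsize - 1L, length(index))
    batch <- index[from:to]           # load one batch of subscripts into RAM
    o <- order(batch)                 # permutation that sorts the batch ascending
    sorted_vals <- values[batch[o]]   # read the values in ascending position order
    out[from:to][o] <- sorted_vals    # restore the originally requested order
  }
  out
}

values <- (1:26) * 10L
index <- c(9L, 2L, 7L, 3L, 8L, 4L, 6L, 5L)
res <- batched_get(values, index, batchsize = 3L)

# The batched read gives the same result as direct subscripting
stopifnot(identical(res, values[index]))
```

In this sketch the sort is repeated on every call. Computing the per-batch ordering once and re-using it across several columns is the motivation the text above gives for switching to a stored ffindexorder.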

Value

Function ffdfindexget returns an ffdf data.frame with those rows selected by the ff index vector.
Function ffdfindexset returns x with those rows replaced that were requested by index and value.

Author(s)

Jens Oehlschlägel

See Also

Examples

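library(ff)

message("ff integer subscripts with ffdf return/assign values")
x <- ff(factor(letters))
y <- ff(1:26)
d <- ffdf(x,y)
i <- ff(2:9)    # ff integer vector of row positions
di <- d[i,]     # extract rows via method dispatch of [
di
d[i,] <- di     # assign rows via method dispatch of [<-
message("ff integer subscripts: more control with ffdfindexget/ffdfindexset")
di <- ffdfindexget(d, i, FF_RETURN=di)    # store the result in the existing ffdf di
d <- ffdfindexset(d, i, di)
rm(x, y, d, i, di)
gc()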

ff

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

v4.0.4
GPL-2 | GPL-3 | file LICENSE
Authors
Daniel Adler [aut], Christian Gläser [aut], Oleg Nenadic [aut], Jens Oehlschlägel [aut, cre], Martijn Schuemie [aut], Walter Zucchini [aut]
Initial release
2020-10-13
