ffindexorder

Sorting: chunked ordering of integer subscript positions


Description

Function ffindexorder calculates, chunk by chunk, the order positions that sort the subscripts within each chunk in ascending order.
Function ffindexordersize calculates the chunk size for ffindexorder.

Usage

ffindexordersize(length, vmode, BATCHBYTES = getOption("ffmaxbytes"))
ffindexorder(index, BATCHSIZE, FF_RETURN = NULL, VERBOSE = FALSE)

Arguments

index

An ff integer vector with integer subscripts.

BATCHSIZE

Limit for the chunksize (see details)

BATCHBYTES

Limit for the number of bytes per batch

FF_RETURN

Optionally an ff integer vector in which the chunkwise order positions are stored.

VERBOSE

Logical scalar; set to TRUE for verbose output.

length

Number of elements in the index

vmode

The vmode of the ff vector to which the index shall be applied with ffindexget or ffindexset

Details

Accessing integer positions in an ff vector is a non-trivial task, because it can easily lead to random access to a disk file. We avoid random access by loading batches of the subscript values into RAM, ordering them ascending, and only then accessing the ff values on disk. Such an ordering can be done on-the-fly by ffindexget, or it can be created upfront with ffindexorder, stored and re-used, similar to storing and using hybrid index information with as.hi.
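
The batch-wise idea described above can be illustrated in base R. This is only a sketch under stated assumptions, not the ff implementation: plain in-memory vectors stand in for ff objects, and the function name chunked_get and the batch size of 8 are hypothetical choices made for illustration.

```r
# Base-R sketch of chunkwise ordering; plain vectors stand in for ff objects.
set.seed(1)
x <- seq(10, 400, by = 10)   # stand-in for values stored on disk
i <- sample(length(x))       # subscripts in random order
batchsize <- 8               # hypothetical batch size

# chunked_get is a hypothetical helper, not part of ff.
chunked_get <- function(x, i, batchsize) {
  out <- vector(typeof(x), length(i))
  if (length(i) == 0L) return(out)
  for (from in seq(1L, length(i), by = batchsize)) {
    to  <- min(from + batchsize - 1L, length(i))
    pos <- from:to             # result positions covered by this batch
    idx <- i[pos]              # one batch of subscripts held in RAM
    o   <- order(idx)          # order positions that sort the batch ascending
    out[pos[o]] <- x[idx[o]]   # read in ascending order, write back in request order
  }
  out
}

result <- chunked_get(x, i, batchsize)
```

Within each batch the values are read in ascending position order, yet the result has the same element order as x[i]. Storing the per-batch order positions (the role of ffindexorder) avoids recomputing them when the same index is applied repeatedly.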

Value

Function ffindexorder returns an ff integer vector with an attribute BATCHSIZE (the chunksize finally used, not the one given with argument BATCHSIZE).
Function ffindexordersize returns a balanced batchsize as returned from bbatch.
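
To give an intuition for what a "balanced" batch size means, the following base-R sketch shows one simple way to balance batches under an upper limit. It is an assumption-laden illustration: the function balanced_batch is hypothetical and does not reproduce the algorithm used by bbatch; it only shows the general idea of spreading N elements over as few batches as the limit B allows, with batches of nearly equal size.

```r
# Hypothetical helper (not bbatch): balance N elements over batches of size <= B.
balanced_batch <- function(N, B) {
  stopifnot(N >= 1, B >= 1)
  nb <- ceiling(N / B)    # fewest batches that respect the limit B
  b  <- ceiling(N / nb)   # nearly equal batch size, never larger than B
  list(b = b, nb = nb)
}

s <- balanced_batch(40, 25)
s$b    # batch size
s$nb   # number of batches
```

With a limit of 25 and 40 elements, an unbalanced split would use batches of 25 and 15, whereas balancing yields two batches of equal size.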

Author(s)

Jens Oehlschlägel

See Also

Examples

library(ff)  # provides ff, fforder, ffindexget, ffindexorder, ffindexordersize

x <- ff(sample(40))
message("fforder requires sorting")
i <- fforder(x)
message("applying this order i is done by ffindexget")
x[i]
message("applying this order i requires random access, therefore ffindexget does chunkwise sorting")
ffindexget(x, i)
message("if we want to apply the order i multiple times, we can do the chunkwise sorting once and store it")
# BATCHBYTES is set very low here to force several chunks on a small example
s <- ffindexordersize(length(i), vmode(i), BATCHBYTES = 100)
o <- ffindexorder(i, s$b)
message("this is how the stored chunkwise sorting is used")
ffindexget(x, i, o)
message("")
rm(x, i, s, o)
gc()

ff

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

v4.0.4
GPL-2 | GPL-3 | file LICENSE
Authors
Daniel Adler [aut], Christian Gläser [aut], Oleg Nenadic [aut], Jens Oehlschlägel [aut, cre], Martijn Schuemie [aut], Walter Zucchini [aut]
Initial release
2020-10-13
