
ffsort

Sorting of ff vectors


Description

Sorting: sort an ff vector – optionally in-place

Usage

ffsort(x
, aux = NULL
, has.na = TRUE
, na.last = TRUE
, decreasing = FALSE
, inplace = FALSE
, decorate = FALSE
, BATCHBYTES = getOption("ffmaxbytes")
, VERBOSE = FALSE
)

Arguments

x

an ff vector

aux

NULL or an ff vector of the same type for temporary storage

has.na

boolean scalar telling ffsort whether the vector might contain NAs. Note that you risk a crash if there are unexpected NAs with has.na=FALSE

na.last

boolean scalar telling ffsort whether to sort NAs last or first. Note that 'boolean' means that there is no third option NA as in sort

decreasing

boolean scalar telling ffsort whether to sort increasing or decreasing

inplace

boolean scalar telling ffsort whether to sort the original ff vector (TRUE) or to create a sorted copy (FALSE, the default)

decorate

boolean scalar telling ffsort whether to decorate the returned ff vector with is.sorted and na.count attributes.

BATCHBYTES

maximum number of RAM bytes ffsort should try not to exceed

VERBOSE

boolean scalar telling ffsort whether to print (via cat) some information about the sorting

Details

ffsort tries to sort the vector in-RAM while respecting the BATCHBYTES limit. If a fast sort is not possible, it uses a slower in-place sort (shellsort). If in-RAM sorting is not possible, it uses an (as yet simple) out-of-memory algorithm. Like ramsort, the in-RAM sorting method is chosen depending on context information. If a key-index sort can be used, ffsort completely avoids merging disk-based subsorts. If argument decorate=TRUE is used, then na.count(x) will return the number of NAs and is.sorted(x) will return TRUE if the sort was done with na.last=TRUE and decreasing=FALSE.
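The decorate behaviour described above can be illustrated with a minimal sketch. It assumes the ff package is installed and that attaching it also makes is.sorted and na.count available; the tiny input vector and the variable name asc are made up for illustration only.

```r
library(ff)  # assumption: attaching ff also provides is.sorted() and na.count()

# Small illustrative ff vector of doubles containing one NA
x <- ff(c(3, NA, 1, 2), vmode = "double")

# Default ordering (increasing, NAs last), with decoration requested
asc <- ffsort(x, decorate = TRUE)

asc[]            # read the sorted values back into RAM
is.sorted(asc)   # decorated by ffsort because na.last=TRUE and decreasing=FALSE
na.count(asc)    # number of NAs recorded during the sort
```

Since inplace keeps its default of FALSE here, x itself is left unsorted and asc is a sorted copy.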

Value

An ff vector – optionally decorated with is.sorted and na.count, see argument 'decorate'

Note

The ff vector must not have a names attribute

Author(s)

Jens Oehlschlägel

See Also

Examples

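n <- 1e6
# one NA followed by the values 999999 down to 1
x <- ff(c(NA, 999999:1), vmode="double", length=n)
# sorted copy, not decorated
x <- ffsort(x)
x
is.sorted(x)
na.count(x)
# sort again and decorate the result with is.sorted and na.count
x <- ffsort(x, decorate=TRUE)
is.sorted(x)
na.count(x)
# limit RAM to n bytes, less than the 8 * n bytes the double vector occupies
x <- ffsort(x, BATCHBYTES=n, VERBOSE=TRUE)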
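The examples above always create a sorted copy. As a complementary sketch, the following shows the inplace and decreasing arguments on a small vector; the values and the variable name y are illustrative assumptions, not part of the original examples.

```r
library(ff)

# Small illustrative ff vector of doubles without NAs
y <- ff(c(2, 5, 1, 4), vmode = "double")

# Sort the original ff vector itself, largest value first
ffsort(y, inplace = TRUE, decreasing = TRUE)

y[]  # y now holds the values in decreasing order
```

With inplace=TRUE no sorted copy is created, which can matter when the vector is large relative to the available disk space.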

ff

Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

v4.0.4
GPL-2 | GPL-3 | file LICENSE
Authors
Daniel Adler [aut], Christian Gläser [aut], Oleg Nenadic [aut], Jens Oehlschlägel [aut, cre], Martijn Schuemie [aut], Walter Zucchini [aut]
Initial release
2020-10-13
