rhdf5: h5createDataset – R documentation

h5createDataset

Create HDF5 dataset

Description

R function to create an HDF5 dataset and define its dimensionality and compression behaviour.

Usage

h5createDataset(file, dataset,
    dims, maxdims = dims,
    storage.mode = "double", H5type = NULL,
    size = NULL, chunk = dims, fillValue,
    level = 6, filter = "gzip", shuffle = TRUE,
    native = FALSE)

Arguments

`file`	The filename (character) of the file in which the dataset will be located. For advanced programmers it is possible to provide an object of class `H5IdComponent` representing a H5 location identifier (file or group). See `H5Fcreate`, `H5Fopen`, `H5Gcreate`, `H5Gopen` to create an object of this kind.
`dataset`	Name of the dataset to be created. The name can contain group names, e.g. 'group/dataset', but the function will fail if the group does not yet exist.
`dims`	The dimensions of the array as they will appear in the file. Note that the dimensions will appear in inverted order when viewing the file with a C program (e.g. HDFView), because the fastest changing dimension in R is the first one, whereas the fastest changing dimension in C is the last one.
`maxdims`	The maximum extension of the array. Use `H5Sunlimited()` to indicate an extensible dimension.
`storage.mode`	The storage mode of the data to be written. Can be obtained by `storage.mode(mydata)`.
`H5type`	Advanced programmers can specify the datatype of the dataset within the file. See `h5const("H5T")` for a list of available datatypes. If `H5type` is specified, the argument `storage.mode` is ignored. It is recommended to use `storage.mode`.
`size`	For `storage.mode='character'` the maximum string length has to be specified. rhdf5 writes null-padded strings by default, so the value provided here should be the length of the longest string. HDF5 then stores the strings as fixed-length character vectors. Together with compression, this should be efficient.
`chunk`	The chunk size used to store the dataset. It is an integer vector of the same length as `dims`. This argument is usually set together with a compression property (argument `level`).
`fillValue`	Standard value for filling the dataset. The storage.mode of the value has to be convertible to the dataset type by HDF5.
`level`	The compression level used. An integer value between 0 (no compression) and 9 (highest and slowest compression).
`filter`	Character defining which compression filter should be applied to the chunks of the dataset. See the Details section for more information on the options that can be provided here.
`shuffle`	Logical defining whether the byte-shuffle algorithm should be applied to data prior to compression.
`native`	An object of class `logical`. If TRUE, array-like objects are treated as stored in HDF5 row-major rather than R column-major orientation. Using `native = TRUE` increases HDF5 file portability between programming languages. A file written with `native = TRUE` should also be read with `native = TRUE`.
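
The interplay of `dims`, `maxdims` and `chunk` is easiest to see with a dataset that grows after creation. The following is a minimal sketch, not part of the original help page: it assumes the rhdf5 package is installed, uses a temporary file and the made-up dataset name "grow", and relies on `h5set_extent()` from rhdf5 to enlarge the dataset. A finite `maxdims` is used here; `H5Sunlimited()` can be supplied instead for a dimension without an upper bound.

```r
library(rhdf5)

# Temporary file so the sketch does not touch the working directory
h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)

# Start as 5 x 2, but allow growth up to 5 x 8.
# Extensible datasets must be chunked; here one column per chunk.
h5createDataset(h5file, "grow",
                dims = c(5, 2), maxdims = c(5, 8),
                storage.mode = "integer", chunk = c(5, 1))
h5write(matrix(1:10, nrow = 5, ncol = 2), file = h5file, name = "grow")

# Enlarge the dataset to 5 x 4 and fill the two new columns
h5set_extent(h5file, "grow", dims = c(5, 4))
h5write(matrix(11:20, nrow = 5, ncol = 2), file = h5file, name = "grow",
        index = list(NULL, 3:4))

res <- h5read(h5file, "grow")
dim(res)
```

Choosing the chunk shape to match the direction of growth (whole columns here) means each extension only adds new chunks rather than rewriting existing ones.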

Details

Creates a new dataset in an existing HDF5 file. The function will fail if the file does not exist or if a dataset with the same name already exists within the specified file.
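
Because both the file and any enclosing group must exist beforehand, a typical call sequence creates them first. The sketch below is illustrative only: it assumes rhdf5 is installed, and the file path and the names "group" and "dataset" are placeholders chosen for this example.

```r
library(rhdf5)

h5file <- tempfile(fileext = ".h5")

# 1. The file has to exist before a dataset can be created in it
h5createFile(h5file)

# 2. So does every group named in the dataset path
h5createGroup(h5file, "group")

# 3. Only now can 'group/dataset' be created
ok <- h5createDataset(h5file, "group/dataset",
                      dims = c(3, 4), storage.mode = "double")

# List the file contents to confirm the dataset is present
listing <- h5ls(h5file)
listing
```

Checking the returned value (`ok` above) before writing is a cheap way to catch a missing group or a name clash early.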

The `filter` argument accepts several options, each matching a compression filter distributed either with the HDF5 library in Rhdf5lib or via the rhdf5filters package. The available plugins and the corresponding values for selecting them are shown below:

zlib: Ubiquitous deflate compression algorithm used in GZIP or ZIP files. All three options below achieve the same result.

"GZIP",
"ZLIB",
"DEFLATE"

szip: Compression algorithm maintained by the HDF5 group.

"SZIP"

bzip2

"BZIP2"

BLOSC meta compressor: As a meta-compressor, BLOSC wraps several different compression algorithms. Each of the options below will activate a different compression filter.

"BLOSC_BLOSCLZ"
"BLOSC_LZ4"
"BLOSC_LZ4HC"
"BLOSC_SNAPPY"
"BLOSC_ZLIB"
"BLOSC_ZSTD"

Disable: It is possible to write chunks without any compression applied.

"NONE"
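
As a rough illustration of how the `filter` value is supplied, the sketch below creates one dataset with the zlib filter and one with compression disabled, then confirms both return the same data. It assumes rhdf5 is installed; the dataset names "gz" and "raw" are invented for this example, and only the "GZIP" and "NONE" options listed above are used, since the other filters additionally depend on rhdf5filters being available.

```r
library(rhdf5)

h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)

x <- matrix(1:1000, nrow = 100, ncol = 10)

# zlib/deflate compression, one column per chunk
h5createDataset(h5file, "gz", dims = dim(x), storage.mode = "integer",
                chunk = c(100, 1), level = 6, filter = "GZIP")

# Same layout, but chunks are written without any compression
h5createDataset(h5file, "raw", dims = dim(x), storage.mode = "integer",
                chunk = c(100, 1), filter = "NONE")

h5write(x, file = h5file, name = "gz")
h5write(x, file = h5file, name = "raw")

# Compression is transparent on read: both datasets hold identical values
same <- all(h5read(h5file, "gz") == h5read(h5file, "raw"))
same
```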

Value

Returns TRUE if the dataset was created successfully and FALSE otherwise.

Author(s)

Bernd Fischer, Mike L. Smith

References

https://portal.hdfgroup.org/display/HDF5

Examples

h5createFile("ex_createDataset.h5")

# create dataset with compression
h5createDataset("ex_createDataset.h5", "A", c(5,8), storage.mode = "integer", chunk=c(5,1), level=6)

# create datasets using the default chunking and compression settings
h5createDataset("ex_createDataset.h5", "B", c(5,8), storage.mode = "integer")
h5createDataset("ex_createDataset.h5", "C", c(5,8), storage.mode = "double")

# create a dataset of strings & define size based on longest string
ex_strings <- c('long', 'longer', 'longest')
h5createDataset("ex_createDataset.h5", "D",  
    storage.mode = "character", chunk = 3, level = 6,
    dims = length(ex_strings), size = max(nchar(ex_strings)))


# write data to dataset
h5write(matrix(1:40, nrow = 5, ncol = 8), file = "ex_createDataset.h5", name = "A")
# write second column
h5write(matrix(1:5, nrow = 5, ncol = 1), file = "ex_createDataset.h5", name = "B", index = list(NULL, 2))
# write character vector
h5write(ex_strings, file = "ex_createDataset.h5", name = "D")

h5dump("ex_createDataset.h5")
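
The partial write to a single column shown above can be checked by reading the same slice back. This is a self-contained sketch under stated assumptions rather than part of the original examples: it assumes rhdf5 is installed, works on a temporary file, and reuses the dataset name "B" purely for illustration.

```r
library(rhdf5)

h5file <- tempfile(fileext = ".h5")
h5createFile(h5file)
h5createDataset(h5file, "B", dims = c(5, 8), storage.mode = "integer")

# Write only the second column of the 5 x 8 dataset
h5write(matrix(1:5, nrow = 5, ncol = 1), file = h5file, name = "B",
        index = list(NULL, 2))

# Read back the same slice; NULL selects every row
col2 <- h5read(h5file, "B", index = list(NULL, 2))
as.vector(col2)
```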

rhdf5

R Interface to HDF5

v2.34.0

Artistic-2.0

Authors

Bernd Fischer [aut], Mike Smith [aut, cre] (<https://orcid.org/0000-0002-7800-3848>), Gregoire Pau [aut], Martin Morgan [ctb], Daniel van Twisk [ctb]
