Write a matrix-like object as an HDF5-based sparse matrix
The 1.3 Million Brain Cell Dataset and other datasets published by 10x Genomics use an HDF5-based sparse matrix representation instead of the conventional (i.e. dense) HDF5 representation.
writeTENxMatrix writes a matrix-like object to this format.
IMPORTANT NOTE: Only use writeTENxMatrix if the matrix-like object to write is sparse, that is, if most of its elements are zero. Using writeTENxMatrix on dense data is very inefficient! In this case, you should use writeHDF5Array instead.
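As a rough illustration of the note above, the sketch below measures the fraction of zeros in an ordinary matrix and picks the writer accordingly. The helper name write_by_sparsity, the 0.5 cutoff, and the dataset name "data" are assumptions made for this example; they are not part of HDF5Array.

```r
library(HDF5Array)

## Hypothetical helper (not part of HDF5Array): choose the on-disk format
## from the fraction of zeros in 'x'. The 0.5 cutoff is an arbitrary
## assumption used only for illustration.
write_by_sparsity <- function(x, filepath, name, cutoff = 0.5) {
    zero_fraction <- mean(x == 0)
    if (zero_fraction > cutoff) {
        ## Mostly zeros: use the 10x Genomics sparse layout.
        writeTENxMatrix(x, filepath, group = name)
    } else {
        ## Mostly nonzero: use the conventional dense layout.
        writeHDF5Array(x, filepath, name = name)
    }
}

sparse_m <- matrix(0L, nrow = 20, ncol = 10)
sparse_m[cbind(1:10, 1:10)] <- 1:10          # 10 nonzero values out of 200
dense_m <- matrix(1:200, nrow = 20, ncol = 10)

A <- write_by_sparsity(sparse_m, tempfile(), "data")
B <- write_by_sparsity(dense_m, tempfile(), "data")
class(A)
class(B)
```

Computing mean(x == 0) is cheap for a small in-memory matrix; for a large on-disk object, a cheaper estimate of the zero fraction would probably be preferable.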
writeTENxMatrix(x, filepath=NULL, group=NULL, level=NULL, verbose=NA)
x |
The matrix-like object to write to an HDF5 file. The object to write should typically be sparse, that is, most of its elements should be zero. If |
filepath |
|
group |
|
level |
The compression level to use for writing the data to disk.
By default, |
verbose |
Whether block processing progress should be displayed or not.
If set to |
Please note that, depending on the size of the data to write to disk and the performance of the disk, writeTENxMatrix can take a long time to complete. Use verbose=TRUE to see its progress.
Use setHDF5DumpFile and setHDF5DumpName to control the location of automatically created HDF5 datasets.
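A minimal sketch of redirecting the output location with setHDF5DumpFile follows. It assumes that, when filepath is left as NULL, writeTENxMatrix writes to the current dump file; the temporary file path and the group name "m" are placeholders chosen for this example.

```r
library(HDF5Array)

## Point the HDF5 dump file at a location of our choosing.
## The temporary path below is used only for illustration.
dump_file <- tempfile(fileext = ".h5")
setHDF5DumpFile(dump_file)
getHDF5DumpFile()

m <- matrix(0L, nrow = 6, ncol = 4)
m[cbind(1:4, 1:4)] <- 1:4                    # 4 nonzero values out of 24

## Assumption: with 'filepath' left as NULL, the data goes to the
## current dump file set above.
M <- writeTENxMatrix(m, group = "m")
path(M)
```

If the assumption holds, path(M) should point at the file that was passed to setHDF5DumpFile.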
A TENxMatrix object pointing to the newly written HDF5 data on disk.
TENxMatrix objects.
The TENxBrainData dataset (in the TENxBrainData package).
HDF5-dump-management to control the location and physical properties of automatically created HDF5 datasets.
h5ls in the rhdf5 package.
## ---------------------------------------------------------------------
## A SIMPLE EXAMPLE
## ---------------------------------------------------------------------
m0 <- matrix(0L, nrow=25, ncol=12,
             dimnames=list(letters[1:25], LETTERS[1:12]))
m0[cbind(2:24, c(12:1, 2:12))] <- 100L + sample(55L, 23, replace=TRUE)
out_file <- tempfile()
M0 <- writeTENxMatrix(m0, out_file, group="m0")
M0
sparsity(M0)
path(M0)  # same as 'out_file'

## Use the h5ls() command from the rhdf5 package to see the structure of
## the file:
library(rhdf5)
h5ls(path(M0))

## ---------------------------------------------------------------------
## USING THE "1.3 Million Brain Cell Dataset"
## ---------------------------------------------------------------------
## The 1.3 Million Brain Cell Dataset from 10x Genomics is available via
## ExperimentHub:
library(ExperimentHub)
hub <- ExperimentHub()
query(hub, "TENxBrainData")
fname <- hub[["EH1039"]]
oneM <- TENxMatrix(fname, "mm10")  # see ?TENxMatrix for the details
oneM

## Note that the following transformation preserves sparsity:
M2 <- log(oneM + 1)  # delayed
M2                   # a DelayedMatrix instance

## In order to reduce computation times, we'll write only the first
## 5000 columns of M2 to disk:
out_file <- tempfile()
M3 <- writeTENxMatrix(M2[ , 1:5000], out_file, group="mm10", verbose=TRUE)
M3                   # a TENxMatrix instance