Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

read_parquet

Read a Parquet file


Description

'Parquet' is a columnar storage file format. This function enables you to read Parquet files into R.

Usage

read_parquet(
  file,
  col_select = NULL,
  as_data_frame = TRUE,
  props = ParquetArrowReaderProperties$create(),
  ...
)

Arguments

file

A character file name or URI, raw vector, an Arrow input stream, or a FileSystem with path (SubTreeFileSystem). If a file name or URI, an Arrow InputStream will be opened and closed when finished. If an input stream is provided, it will be left open.

col_select

A character vector of column names to keep, as in the "select" argument to data.table::fread(), or a tidy selection specification of columns, as used in dplyr::select().

as_data_frame

Should the function return a data.frame (default) or an Arrow Table?

props

ParquetArrowReaderProperties

...

Additional arguments passed to ParquetFileReader$create()

Value

A arrow::Table, or a data.frame if as_data_frame is TRUE (the default).

Examples

## Not run: 
tf <- tempfile()
on.exit(unlink(tf))
write_parquet(mtcars, tf)
df <- read_parquet(tf, col_select = starts_with("d"))
head(df)

## End(Not run)

arrow

Integration to 'Apache' 'Arrow'

v4.0.0.1
Apache License (>= 2.0)
Authors
Neal Richardson [aut, cre], Ian Cook [aut], Jonathan Keane [aut], Romain François [aut] (<https://orcid.org/0000-0002-2444-4226>), Jeroen Ooms [aut], Javier Luraschi [ctb], Jeffrey Wong [ctb], Apache Arrow [aut, cph]
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.