Read and write Stata DTA files
Currently haven can read and write logical, integer, numeric, character,
and factor variables. See labelled() for how labelled variables in
Stata are handled in R.
read_dta(
  file,
  encoding = NULL,
  col_select = NULL,
  skip = 0,
  n_max = Inf,
  .name_repair = "unique"
)

read_stata(
  file,
  encoding = NULL,
  col_select = NULL,
  skip = 0,
  n_max = Inf,
  .name_repair = "unique"
)

write_dta(data, path, version = 14, label = attr(data, "label"))
file |
Either a path to a file, a connection, or literal data (either a single string or a raw vector). Files ending in … Literal data is most useful for examples and tests. It must contain at least one new line to be recognised as data (instead of a path), or be a vector of length greater than 1. Using a value of |
encoding |
The character encoding used for the file. Generally, only needed for Stata 13 files and earlier. See Encoding section for details. |
col_select |
One or more selection expressions, like in
|
skip |
Number of lines to skip before reading data. |
n_max |
Maximum number of lines to read. |
.name_repair |
Treatment of problematic column names:
This argument is passed on as |
data |
Data frame to write. |
path |
Path to a file where the data will be written. |
version |
File version to use. Supports versions 8-15. |
label |
Dataset label to use, or |
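As a rough illustration of how the reading arguments above combine, the sketch below writes mtcars to a temporary file and reads back only part of it. The temporary file and the choice of columns are assumptions made for the example, not part of the original documentation.

```r
library(haven)

# Write a small example file to a temporary location (illustrative only).
tmp <- tempfile(fileext = ".dta")
write_dta(mtcars, tmp)

# Read two columns, skip the first 10 rows, and read at most 5 rows.
small <- read_dta(
  tmp,
  col_select = c(mpg, cyl),
  skip = 10,
  n_max = 5
)

names(small)
nrow(small)
```

Here col_select takes unquoted column names wrapped in c(), and skip and n_max together select a window of rows from the file.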
A tibble, a data frame variant with nice defaults.
Variable labels are stored in the "label" attribute of each variable. They are not printed on the console, but the RStudio viewer will show them.
If a dataset label is defined in Stata, it will be stored in the "label" attribute of the tibble.
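The following sketch shows one way to inspect these attributes after a round trip through a DTA file. The data frame, the label text, and the temporary file are made up for the example.

```r
library(haven)

# A toy data frame with a variable label attached (illustrative values).
df <- data.frame(height = c(1.62, 1.75, 1.80))
attr(df$height, "label") <- "Height in metres"

# Write it out with a dataset label, then read it back.
tmp <- tempfile(fileext = ".dta")
write_dta(df, tmp, label = "Toy height data")
out <- read_dta(tmp)

# Variable label: stored on the column itself.
attr(out$height, "label")

# Dataset label: stored on the tibble.
attr(out, "label")
```

Because the labels are plain attributes, attr() is enough to read or set them; no special accessor is needed.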
write_dta() returns the input data invisibly.
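A small sketch of what the invisible return means in practice, using base R's withVisible() to inspect the result. The temporary file is an assumption for the example.

```r
library(haven)

tmp <- tempfile(fileext = ".dta")

# withVisible() reports both the returned value and whether it would
# have been auto-printed at the console.
res <- withVisible(write_dta(mtcars, tmp))

res$visible              # FALSE: nothing is printed when called at top level
is.data.frame(res$value) # the data is still returned and can be reused
```

Returning the data invisibly lets write_dta() sit in the middle of a pipeline without cluttering the console.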
Prior to Stata 14, files did not declare a text encoding, and the
default encoding differed across platforms. If encoding = NULL,
haven assumes the encoding is windows-1252, the text encoding used by
Stata on Windows. Unfortunately, Stata on Mac and Linux uses a different
default encoding, "latin1". If you encounter an error such as
"Unable to convert string to the requested encoding", try
encoding = "latin1".

For Stata 14 and later, you should not need to manually specify the
encoding value unless the value was incorrectly recorded in the source file.
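One possible way to apply this advice is to retry with "latin1" only when the conversion error appears. The helper below, read_dta_with_fallback(), is hypothetical and not part of haven; it assumes the error message contains the text quoted above.

```r
library(haven)

# Hypothetical helper (not part of haven): read with the default encoding
# and retry with a fallback encoding only on a string-conversion error.
read_dta_with_fallback <- function(file, fallback = "latin1") {
  tryCatch(
    read_dta(file),
    error = function(e) {
      # Assumption: the error message includes this phrase.
      is_encoding_error <- grepl(
        "Unable to convert string",
        conditionMessage(e),
        fixed = TRUE
      )
      if (!is_encoding_error) {
        stop(e)  # re-raise anything unrelated to encoding
      }
      message("Retrying with encoding = \"", fallback, "\"")
      read_dta(file, encoding = fallback)
    }
  )
}

# Usage with a file written by haven itself (no fallback is needed here).
tmp <- tempfile(fileext = ".dta")
write_dta(mtcars, tmp)
cars <- read_dta_with_fallback(tmp)
```

Matching on the message text is fragile, so for files known to come from Stata 13 or earlier on Mac or Linux, passing encoding = "latin1" directly is the simpler choice.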
path <- system.file("examples", "iris.dta", package = "haven")
read_dta(path)

tmp <- tempfile(fileext = ".dta")
write_dta(mtcars, tmp)
read_dta(tmp)
read_stata(tmp)