Look for keywords in variable names and descriptions / Create a data dictionary
look_for emulates the lookfor Stata command in R. It supports searching the variable names of regular R data frames as well as variable labels (descriptions).
The command is meant to help users find variables in large datasets.
look_for(
  data,
  ...,
  labels = TRUE,
  ignore.case = TRUE,
  details = c("basic", "none", "full")
)

lookfor(
  data,
  ...,
  labels = TRUE,
  ignore.case = TRUE,
  details = c("basic", "none", "full")
)

generate_dictionary(
  data,
  ...,
  labels = TRUE,
  ignore.case = TRUE,
  details = c("basic", "none", "full")
)

## S3 method for class 'look_for'
print(x, ...)

convert_list_columns_to_character(x)

lookfor_to_long_format(x)
data | a data frame
... | optional list of keywords, a character string (or several character strings), which can be formatted as a regular expression suitable for a
labels | whether or not to search variable labels (descriptions);
ignore.case | whether or not to make the keywords case sensitive;
details | add details about each variable (full details could be time consuming for big data frames,
x | a tibble returned by
When no keyword is provided, it will produce a data dictionary of the overall data frame.
The function looks into the variable names for matches to the keywords. If available, variable labels are included in the search scope.
Variable labels of data frames imported with the foreign or memisc packages are also taken into account (see to_labelled()). If no keyword is provided, it returns all variables of data.
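The search described above can be illustrated with a minimal base R sketch. This is not the package's implementation: find_vars() is a hypothetical helper, and it assumes that variable labels are stored in a "label" attribute on each column and that several keywords are combined as alternatives in one regular expression.

```r
# Hypothetical helper (illustration only, not the package's code):
# return the names of variables whose name or label matches any keyword.
find_vars <- function(data, ..., labels = TRUE, ignore.case = TRUE) {
  keywords <- c(...)
  var_names <- names(data)

  # With no keyword, every variable is returned
  if (length(keywords) == 0) {
    return(var_names)
  }

  # Assumption: labels live in a "label" attribute on each column
  var_labels <- unname(vapply(data, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)[1]
  }, character(1)))

  # Assumption: several keywords act as alternatives (logical OR)
  pattern <- paste(keywords, collapse = "|")

  hit <- grepl(pattern, var_names, ignore.case = ignore.case)
  if (labels) {
    hit <- hit | (!is.na(var_labels) &
      grepl(pattern, var_labels, ignore.case = ignore.case))
  }
  var_names[hit]
}

find_vars(iris, "petal")
find_vars(iris, "Pet", "sp", ignore.case = FALSE)
```

Matching on the label as well as the name is what lets a keyword find a variable with an uninformative name, provided its description mentions the keyword.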
look_for(), lookfor() and generate_dictionary() are equivalent.
By default, results will be summarized when printing. To deactivate default printing, use dplyr::as_tibble().
lookfor_to_long_format() could be used to transform results with one row per factor level and per value label.
Use convert_list_columns_to_character() to convert named list columns into character vectors (see examples).
A tibble data frame featuring the variable position, name and description (if it exists) in the original data frame.
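To make the shape of such a result concrete, here is a rough base R sketch of a data dictionary with one row per variable. It is an illustration under stated assumptions, not the function's actual output: sketch_dictionary() and the column names pos, variable and label are hypothetical, a plain data frame stands in for the tibble, and labels are assumed to be stored in a "label" attribute.

```r
# Hypothetical helper (illustration only): one row per variable with its
# position, name and label (NA when the variable has no label).
sketch_dictionary <- function(data) {
  get_label <- function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)[1]
  }
  data.frame(
    pos = seq_along(data),
    variable = names(data),
    label = unname(vapply(data, get_label, character(1))),
    stringsAsFactors = FALSE
  )
}

dict <- sketch_dictionary(iris)
dict
```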
François Briatte f.briatte@gmail.com, Joseph Larmarange joseph@larmarange.net
Based on the behavior of the lookfor command in Stata.
library(labelled) # package providing look_for()
library(dplyr)    # provides the %>% pipe used below

look_for(iris)

# Look for a single keyword
look_for(iris, "petal")
look_for(iris, "s")

# Look for with a regular expression
look_for(iris, "petal|species")
look_for(iris, "s$")

# Look for with several keywords
look_for(iris, "pet", "sp")
look_for(iris, "pet", "sp", "width")
look_for(iris, "Pet", "sp", "width", ignore.case = FALSE)

# Quicker search without variable details
look_for(iris, details = "none")

# To obtain more details about each variable
look_for(iris, details = "full")

# To deactivate default printing, convert to tibble
look_for(iris, details = "full") %>%
  dplyr::as_tibble()

# To convert named lists into character vectors
look_for(iris) %>% convert_list_columns_to_character()

# Long format with one row per factor level and per value label
look_for(iris) %>% lookfor_to_long_format()

# Both functions can be combined
look_for(iris) %>%
  lookfor_to_long_format() %>%
  convert_list_columns_to_character()

# Labelled data
## Not run:
data(fertility, package = "questionr")
look_for(children)
look_for(children, "id")
look_for(children) %>%
  lookfor_to_long_format() %>%
  convert_list_columns_to_character()
## End(Not run)