Assign missing categories to "unknown"
step_unknown creates a specification of a recipe step that will assign missing values in a factor to a new level, "unknown" by default.
step_unknown(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  new_level = "unknown",
  objects = NULL,
  skip = FALSE,
  id = rand_id("unknown")
)

## S3 method for class 'step_unknown'
tidy(x, ...)
recipe | A recipe object. The step will be added to the sequence of operations for this recipe.
... | One or more selector functions to choose which variables will be affected by the step. These variables should be character or factor types. See selections() for more details.
role | Not used by this step since no new variables are created.
trained | A logical to indicate if the quantities for preprocessing have been estimated.
new_level | A single character value that will be assigned to new factor levels.
objects | A list of objects that contain the information on factor levels that will be determined by prep().
skip | A logical. Should the step be skipped when the recipe is baked by bake()?
id | A character string that is unique to this step to identify it.
x | A step_unknown object.
The selected variables are adjusted to have a new level (given by new_level) that is placed in the last position.
Note that if the original columns are character, they will be converted to factors by this step.
If new_level is already in the data given to prep, an error is thrown.
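The behaviour described above can be illustrated with a minimal sketch. The data frames toy and clash are hypothetical and exist only for this illustration; the sketch assumes the recipes package is installed and relies only on the functions already named on this page.

```r
library(recipes)

# Hypothetical toy data: a character column containing one missing value
toy <- data.frame(
  diet = c("vegan", NA, "anything"),
  stringsAsFactors = FALSE
)

rec <- recipe(~ diet, data = toy) %>%
  step_unknown(diet, new_level = "unknown diet") %>%
  prep()

baked <- bake(rec, new_data = NULL)

# The character column comes back as a factor ...
is.factor(baked$diet)
# ... with the new level placed in the last position ...
levels(baked$diet)
# ... and the missing value replaced by that level
as.character(baked$diet)

# Hypothetical data in which new_level already occurs: prep() should fail
clash <- data.frame(
  diet = c("vegan", "unknown diet", NA),
  stringsAsFactors = FALSE
)

clash_result <- tryCatch(
  {
    recipe(~ diet, data = clash) %>%
      step_unknown(diet, new_level = "unknown diet") %>%
      prep()
    "no error"
  },
  error = function(e) "error"
)
clash_result
```

Picking a new_level value that cannot collide with a real category (for example a column-specific label such as "unknown diet") avoids the error case.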
An updated version of recipe with the new step added to the sequence of existing steps (if any). For the tidy method, a tibble with columns terms (the columns that will be affected) and value (the factor level that is used for the new value).
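library(recipes)   # provides recipe(), step_unknown(), prep(), bake(); attaches dplyr
library(modeldata)
data(okc)

# Give each column its own label for missing values
rec <-
  recipe(~ diet + location, data = okc) %>%
  step_unknown(diet, new_level = "unknown diet") %>%
  step_unknown(location, new_level = "unknown location") %>%
  prep()

# Cross-tabulate the processed values against the original ones
table(bake(rec, new_data = NULL) %>% pull(diet),
      okc %>% pull(diet),
      useNA = "always") %>%
  as.data.frame() %>%
  dplyr::filter(Freq > 0)

# Show the columns affected by the first step and the level it assigns
tidy(rec, number = 1)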