Convert Numbers to Factors
step_num2factor will convert one or more numeric vectors to factors
(ordered or unordered). This can be useful when categories are encoded as
integers.
step_num2factor(
  recipe,
  ...,
  role = NA,
  transform = function(x) x,
  trained = FALSE,
  levels,
  ordered = FALSE,
  skip = FALSE,
  id = rand_id("num2factor")
)

## S3 method for class 'step_num2factor'
tidy(x, ...)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which variables will be
converted to factors. See |
role |
Not used by this step since no new variables are created. |
transform |
A function taking a single argument |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
levels |
A character vector of values that will be used as the levels.
These are the numeric data converted to character and ordered. This is
modified once |
ordered |
A single logical value; should the factor(s) be ordered? |
skip |
A logical. Should the step be skipped when the
recipe is baked by |
id |
A character string that is unique to this step to identify it. |
x |
A |
An updated version of recipe with the new step added to the sequence of
existing steps (if any). For the tidy method, a tibble with columns terms
(the selectors or variables selected) and ordered.
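The tidy method is not exercised in the examples, so the following is a minimal sketch of how it might be called. It assumes the built-in mtcars data, where gear takes the values 3, 4 and 5; the level labels and the object names (gear_levels, rec_gear, baked, tidied) are hypothetical and chosen only for illustration.

```r
library(recipes)

# Hypothetical labels for the three gear values (3, 4, 5) in mtcars
gear_levels <- c("three", "four", "five")

rec_gear <-
  recipe(mpg ~ gear, data = mtcars) %>%
  step_num2factor(
    gear,
    # shift 3, 4, 5 down to 1, 2, 3 so each value indexes into gear_levels
    transform = function(x) x - 2,
    levels = gear_levels
  ) %>%
  prep()

# The converted column, as produced on the training data
baked <- bake(rec_gear, new_data = NULL)

# Summarise the first (and only) step in the recipe
tidied <- tidy(rec_gear, number = 1)
tidied
```

Calling tidy() on the recipe with number = 1 selects the step by its position; if the step was given an explicit id, that could be passed instead.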
library(recipes)
library(dplyr)
library(modeldata)
data(attrition)

# Count the observations at each stock option level
attrition %>%
  group_by(StockOptionLevel) %>%
  count()

amnt <- c("nothin", "meh", "some", "copious")

# Shift the numeric codes by one so that they index into `amnt`
rec <-
  recipe(Attrition ~ StockOptionLevel, data = attrition) %>%
  step_num2factor(
    StockOptionLevel,
    transform = function(x) x + 1,
    levels = amnt
  )

encoded <- rec %>%
  prep() %>%
  bake(new_data = NULL)

# Compare the new factor levels with the original numeric codes
table(encoded$StockOptionLevel, attrition$StockOptionLevel)

# an example for binning
binner <- function(x) {
  x <- cut(x, breaks = 1000 * c(0, 5, 10, 20), include.lowest = TRUE)
  # now return the group number
  as.numeric(x)
}

inc <- c("low", "med", "high")

rec <-
  recipe(Attrition ~ MonthlyIncome, data = attrition) %>%
  step_num2factor(
    MonthlyIncome,
    transform = binner,
    levels = inc,
    ordered = TRUE
  ) %>%
  prep()

encoded <- bake(rec, new_data = NULL)

table(encoded$MonthlyIncome, binner(attrition$MonthlyIncome))

# What happens when a value is out of range?
ceo <- attrition %>%
  slice(1) %>%
  mutate(MonthlyIncome = 10^10)

bake(rec, ceo)