Impute Numeric Data Below the Threshold of Measurement
step_impute_lower() creates a specification of a recipe step
for non-negative numeric data that cannot be measured below a
known value. In such cases, one imputation method is to replace
each truncated value with a random uniform number between zero
and the truncation point.
step_impute_lower(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  threshold = NULL,
  skip = FALSE,
  id = rand_id("impute_lower")
)
step_lowerimpute(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  threshold = NULL,
  skip = FALSE,
  id = rand_id("impute_lower")
)
## S3 method for class 'step_impute_lower'
tidy(x, ...)
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which
variables are affected by the step. See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
threshold |
A named numeric vector of lower bounds. This is
|
skip |
A logical. Should the step be skipped when the
recipe is baked by |
id |
A character string that is unique to this step to identify it. |
x |
A |
step_impute_lower estimates each variable's minimum from the
data passed to the training argument of prep.recipe.
bake.recipe then replaces any value at that minimum with a
random uniform value between zero and the minimum.
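The two phases described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the helpers estimate_lower() and impute_lower() are hypothetical names, the data frames are made up, and the sketch assumes that any value at or below the estimated minimum counts as truncated.

```r
# Hypothetical helpers illustrating the idea; not the recipes internals.
estimate_lower <- function(training, cols) {
  # Training phase: the threshold for each column is its observed minimum.
  vapply(training[cols], min, numeric(1), na.rm = TRUE)
}

impute_lower <- function(new_data, threshold) {
  for (col in names(threshold)) {
    # Assumption: values at or below the threshold are treated as truncated.
    hit <- which(new_data[[col]] <= threshold[[col]])
    # Replace them with uniform draws between zero and the threshold.
    new_data[[col]][hit] <- runif(length(hit), min = 0, max = threshold[[col]])
  }
  new_data
}

set.seed(1)
# Made-up data: 40 and 5 play the role of the measurement limits.
train <- data.frame(x = c(40, 40, 52.3, 61.8), y = c(5, 6.1, 5, 7.4))
test  <- data.frame(x = c(40, 45.0), y = c(5, 5.5))

thr <- estimate_lower(train, c("x", "y"))  # named numeric vector of minimums
out <- impute_lower(test, thr)             # only values at the minimum change
```

Because the thresholds come from the training data alone, the same limits are applied to any new data, and values above the limit pass through unchanged.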
As of recipes 0.1.16, this function name changed from step_lowerimpute()
to step_impute_lower().
An updated version of recipe with the new step
added to the sequence of existing steps (if any). For the
tidy method, a tibble with columns terms (the
selectors or variables selected) and value for the estimated
threshold.
library(recipes)
library(modeldata)
data(biomass)
## Truncate some values to emulate what a lower limit of
## the measurement system might look like
biomass$carbon <- ifelse(biomass$carbon > 40, biomass$carbon, 40)
biomass$hydrogen <- ifelse(biomass$hydrogen > 5, biomass$hydrogen, 5)
biomass_tr <- biomass[biomass$dataset == "Training",]
biomass_te <- biomass[biomass$dataset == "Testing",]
rec <- recipe(HHV ~ carbon + hydrogen + oxygen + nitrogen + sulfur,
              data = biomass_tr)
impute_rec <- rec %>%
  step_impute_lower(carbon, hydrogen)
tidy(impute_rec, number = 1)
impute_rec <- prep(impute_rec, training = biomass_tr)
tidy(impute_rec, number = 1)
transformed_te <- bake(impute_rec, biomass_te)
plot(transformed_te$carbon, biomass_te$carbon,
     ylab = "pre-imputation", xlab = "imputed")