Impute Numeric Data Below the Threshold of Measurement
step_impute_lower() creates a specification of a recipe step designed for cases where non-negative numeric data cannot be measured below a known value. In these cases, one method for imputing the data is to replace the truncated value with a random uniform number between zero and the truncation point.
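The substitution described above can be sketched in base R for a single vector. This is an illustration only, not the code used by recipes: the helper name impute_below and its limit argument are hypothetical, and treating values at or below the limit as truncated is an assumption.

```r
# Hypothetical helper (not part of recipes): replace values at or below a
# known measurement limit with draws from Uniform(0, limit).
impute_below <- function(x, limit) {
  stopifnot(is.numeric(x), limit > 0, all(x >= 0, na.rm = TRUE))
  censored <- which(x <= limit)  # which() skips NA positions
  x[censored] <- runif(length(censored), min = 0, max = limit)
  x
}

set.seed(1)
carbon <- c(40, 52.3, 40, 47.1, NA)
imputed <- impute_below(carbon, limit = 40)
```

Values above the limit and missing values are left unchanged; only the truncated entries are redrawn.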
step_impute_lower(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  threshold = NULL,
  skip = FALSE,
  id = rand_id("impute_lower")
)

step_lowerimpute(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  threshold = NULL,
  skip = FALSE,
  id = rand_id("impute_lower")
)

## S3 method for class 'step_impute_lower'
tidy(x, ...)
recipe | A recipe object. The step will be added to the sequence of operations for this recipe.
... | One or more selector functions to choose which variables are affected by the step. See
role | Not used by this step since no new variables are created.
trained | A logical to indicate if the quantities for preprocessing have been estimated.
threshold | A named numeric vector of lower bounds. This is
skip | A logical. Should the step be skipped when the recipe is baked by
id | A character string that is unique to this step to identify it.
x | A
step_impute_lower estimates the variable minimums from the data used in the training argument of prep.recipe. bake.recipe then replaces any value at the minimum with a random uniform value between zero and the minimum.
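The estimate-then-apply split described above can be sketched in base R. The two helpers below, estimate_thresholds and apply_thresholds, are hypothetical names used for illustration and are not recipes internals; the small data frames are made up.

```r
# Hypothetical helpers that mirror the two phases; not recipes internals.

# Phase 1 (analogous to prep): learn one minimum per column from training data.
estimate_thresholds <- function(training, cols) {
  vapply(training[cols], min, numeric(1), na.rm = TRUE)
}

# Phase 2 (analogous to bake): redraw values at the learned minimum.
apply_thresholds <- function(new_data, thresholds) {
  for (col in names(thresholds)) {
    at_min <- which(new_data[[col]] <= thresholds[[col]])
    new_data[[col]][at_min] <- runif(
      length(at_min), min = 0, max = thresholds[[col]]
    )
  }
  new_data
}

set.seed(2)
train <- data.frame(carbon = c(40, 45, 40, 60), hydrogen = c(5, 5.5, 6, 5))
test  <- data.frame(carbon = c(40, 50), hydrogen = c(5, 7))

thr   <- estimate_thresholds(train, c("carbon", "hydrogen"))
baked <- apply_thresholds(test, thr)
```

Because the thresholds come only from the training data, the same named vector can be reused on any later data set without re-estimating it.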
As of recipes 0.1.16, this function name changed from step_lowerimpute() to step_impute_lower().
An updated version of recipe with the new step added to the sequence of existing steps (if any). For the tidy method, a tibble with columns terms (the selectors or variables selected) and value for the estimated threshold.
library(recipes)
library(modeldata)
data(biomass)

## Truncate some values to emulate what a lower limit of
## the measurement system might look like
biomass$carbon <- ifelse(biomass$carbon > 40, biomass$carbon, 40)
biomass$hydrogen <- ifelse(biomass$hydrogen > 5, biomass$hydrogen, 5)

biomass_tr <- biomass[biomass$dataset == "Training",]
biomass_te <- biomass[biomass$dataset == "Testing",]

rec <- recipe(HHV ~ carbon + hydrogen + oxygen + nitrogen + sulfur,
              data = biomass_tr)

impute_rec <- rec %>%
  step_impute_lower(carbon, hydrogen)

## Before prep(): thresholds are not yet estimated
tidy(impute_rec, number = 1)

impute_rec <- prep(impute_rec, training = biomass_tr)

## After prep(): estimated thresholds per variable
tidy(impute_rec, number = 1)

transformed_te <- bake(impute_rec, biomass_te)

plot(transformed_te$carbon, biomass_te$carbon,
     ylab = "pre-imputation", xlab = "imputed")