Create, modify, and delete columns
mutate() adds new variables and preserves existing ones; transmute() adds new variables and drops existing ones. New variables overwrite existing variables of the same name. Variables can be removed by setting their value to NULL.
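As a minimal sketch of the difference, assuming dplyr is installed and using a small made-up tibble (the column names x, y, and z are illustrative only and do not come from the documentation):

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
df <- tibble(x = 1:3, y = c(10, 20, 30))

# mutate() keeps x and y and appends z
kept <- df %>% mutate(z = x + y)
names(kept)

# transmute() keeps only the columns created in the call
dropped <- df %>% transmute(z = x + y)
names(dropped)

# Reusing an existing name overwrites that column; NULL removes one
changed <- df %>% mutate(x = x * 2, y = NULL)
names(changed)
```

Comparing the three names() results side by side shows which columns each call keeps, replaces, or drops.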
mutate(.data, ...)

## S3 method for class 'data.frame'
mutate(
  .data,
  ...,
  .keep = c("all", "used", "unused", "none"),
  .before = NULL,
  .after = NULL
)

transmute(.data, ...)
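.data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
...
Name-value pairs. The name gives the name of the column in the output. Setting a value to NULL removes that column.
.keep
This is an experimental argument that allows you to control which columns from .data are retained in the output: "all" (the default), "used", "unused", or "none". Grouping variables are always kept, regardless of .keep.
.before, .after
Optionally, control where new columns should appear (the default is to add them on the far right).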
An object of the same type as .data. The output has the following properties:

Rows are not affected.
Existing columns will be preserved according to the .keep argument.
New columns will be placed according to the .before and .after arguments. If .keep = "none" (as in transmute()), the output order is determined only by ..., not the order of existing columns.
Columns given value NULL will be removed.
Groups will be recomputed if a grouping variable is mutated.
Data frame attributes are preserved.
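Several of these properties can be checked directly. The following is a rough sketch on a hypothetical three-row tibble (the names g, x, and y are assumptions made for illustration), not an exhaustive test of the guarantees listed above:

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
df <- tibble(g = c("a", "a", "b"), x = c(1, 2, 3))

# Rows are not affected, and a column set to NULL is removed
out <- df %>% mutate(y = x * 10, x = NULL)
nrow(out) == nrow(df)
names(out)

# .before / .after control where new columns are placed
names(df %>% mutate(y = x * 10, .before = 1))

# Mutating a grouping variable recomputes the groups
regrouped <- df %>%
  group_by(g) %>%
  mutate(g = "all")
n_groups(regrouped)
```

Here the input has two groups under g, while the regrouped result collapses to a single group because every row now shares the same value of g.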
Because mutating expressions are computed within groups, they may yield different results on grouped tibbles. This will be the case as soon as an aggregating, lagging, or ranking function is involved. Compare this ungrouped mutate:

starwars %>%
  select(name, mass, species) %>%
  mutate(mass_norm = mass / mean(mass, na.rm = TRUE))

With the grouped equivalent:

starwars %>%
  select(name, mass, species) %>%
  group_by(species) %>%
  mutate(mass_norm = mass / mean(mass, na.rm = TRUE))

The former normalises mass by the global average whereas the latter normalises by the averages within species levels.
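The effect is easier to see on numbers small enough to verify by hand. This sketch uses a made-up four-row tibble rather than starwars; the species labels and masses are invented for illustration:

```r
library(dplyr)

# Hypothetical data: two species with two individuals each
df <- tibble(
  species = c("A", "A", "B", "B"),
  mass    = c(10, 30, 100, 300)
)

# Ungrouped: every row is divided by the overall mean (110)
ungrouped <- df %>%
  mutate(mass_norm = mass / mean(mass))

# Grouped: each row is divided by its own species mean
# (20 for species A, 200 for species B)
grouped <- df %>%
  group_by(species) %>%
  mutate(mass_norm = mass / mean(mass)) %>%
  ungroup()

ungrouped$mass_norm
grouped$mass_norm  # 0.5 1.5 0.5 1.5
```

In the grouped result both species end up on the same relative scale, whereas the ungrouped result is dominated by the heavier species.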
These functions are generics, which means that packages can provide implementations (methods) for other classes. See the documentation of individual methods for extra arguments and differences in behaviour.
Methods available in currently loaded packages:
mutate(): no methods found.
transmute(): no methods found.
library(dplyr)

# Newly created variables are available immediately
starwars %>%
  select(name, mass) %>%
  mutate(
    mass2 = mass * 2,
    mass2_squared = mass2 * mass2
  )

# As well as adding new variables, you can use mutate() to
# remove variables and modify existing variables.
starwars %>%
  select(name, height, mass, homeworld) %>%
  mutate(
    mass = NULL,
    height = height * 0.0328084 # convert to feet
  )

# Use across() with mutate() to apply a transformation
# to multiple columns in a tibble.
starwars %>%
  select(name, homeworld, species) %>%
  mutate(across(!name, as.factor))
# see more in ?across

# Window functions are useful for grouped mutates:
starwars %>%
  select(name, mass, homeworld) %>%
  group_by(homeworld) %>%
  mutate(rank = min_rank(desc(mass)))
# see `vignette("window-functions")` for more details

# By default, new columns are placed on the far right.
# Experimental: you can override with `.before` or `.after`
df <- tibble(x = 1, y = 2)
df %>% mutate(z = x + y)
df %>% mutate(z = x + y, .before = 1)
df %>% mutate(z = x + y, .after = x)

# By default, mutate() keeps all columns from the input data.
# Experimental: You can override with `.keep`
df <- tibble(x = 1, y = 2, a = "a", b = "b")
df %>% mutate(z = x + y, .keep = "all") # the default
df %>% mutate(z = x + y, .keep = "used")
df %>% mutate(z = x + y, .keep = "unused")
df %>% mutate(z = x + y, .keep = "none") # same as transmute()

# Grouping ----------------------------------------
# The mutate operation may yield different results on grouped
# tibbles because the expressions are computed within groups.
# The following normalises `mass` by the global average:
starwars %>%
  select(name, mass, species) %>%
  mutate(mass_norm = mass / mean(mass, na.rm = TRUE))

# Whereas this normalises `mass` by the averages within species
# levels:
starwars %>%
  select(name, mass, species) %>%
  group_by(species) %>%
  mutate(mass_norm = mass / mean(mass, na.rm = TRUE))

# Indirection ----------------------------------------
# Refer to column names stored as strings with the `.data` pronoun:
vars <- c("mass", "height")
mutate(starwars, prod = .data[[vars[[1]]]] * .data[[vars[[2]]]])
# Learn more in ?dplyr_data_masking