Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

stat_mean_sd_text

Add Text Indicating the Mean and Standard Deviation to a ggplot2 Plot


Description

For a strip plot or scatterplot produced using the package ggplot2 (e.g., with geom_point), for each value on the x-axis, add text indicating the mean and standard deviation of the y-values for that particular x-value.

Usage

stat_mean_sd_text(mapping = NULL, data = NULL, 
    geom = ifelse(text.box, "label", "text"), 
    position = "identity", na.rm = FALSE, show.legend = NA, 
    inherit.aes = TRUE, y.pos = NULL, y.expand.factor = 0.2, 
    digits = 1, digit.type = "round", 
    nsmall = ifelse(digit.type == "round", digits, 0), text.box = FALSE, 
    alpha = 1, angle = 0, color = "black", family = "", fontface = "plain", 
    hjust = 0.5, label.padding = ggplot2::unit(0.25, "lines"), 
    label.r = ggplot2::unit(0.15, "lines"), label.size = 0.25, 
    lineheight = 1.2, size = 4, vjust = 0.5, ...)

Arguments

mapping, data, position, na.rm, show.legend, inherit.aes

See the help file for geom_text.

geom

Character string indicating which geom to use to display the text. Setting geom="text" will use geom_text to display the text, and setting geom="label" will use geom_label to display the text. The default value is geom="text" unless the user sets text.box=TRUE.

y.pos

Numeric scalar indicating the y-position of the text (i.e., the value of the argument y that will be used in the call to geom_text or geom_label). The default value is y.pos=NULL, in which case y.pos is set to the maximum value of all y-values plus a proportion of the range of all y-values, where the proportion is determined by the argument y.expand.factor (see below).

y.expand.factor

For the case when y.pos=NULL, a numeric scalar indicating the proportion by which the range of all y-values should be multiplied by before adding this value to the maximum value of all y-values in order to compute the value of the argument y.pos (see above). The default value is y.expand.factor=0.2.

digits

Integer indicating the number of digits to use for displaying the mean and standard deviation. When digit.type="round" (see below) the argument digits indicates the number of digits to round to, and when digit.type="signif" the argument digits indicates the number of significant digits to display. The default value is digits=1.

digit.type

Character string indicating whether the digits argument (see above) refers to significant digits (digit.type="signif"), or how many decimal places to round to (digit.type="round", the default).

nsmall

Integer passed to the function format indicating the the minimum number of digits to the right of the decimal point for the computed mean and standard deviation. The default value is nsmall=digits when digit.type="round" and nsmall=0 when digit.type="signif". When nsmall is greater than 0, all computed means and standard deviations will have the same number of digits to the right of the decimal point (including, possibly, trailing zeros). To omit trailing zeros, set nsmall=0.

text.box

Logical scalar indicating whether to surround the text with a text box (i.e., whether to use geom_label instead of geom_text). This argument can be overridden by simply specifying the argument geom.

alpha, angle, color, family, fontface, hjust, vjust, lineheight, size

See the help file for geom_text and the vignette Aesthetic specifications at https://cran.r-project.org/package=ggplot2/vignettes/ggplot2-specs.html.

label.padding, label.r, label.size

See the help file for geom_text.

...

Other arguments passed on to layer.

Details

See the help file for geom_text for details about how geom_text and geom_label work.

See the vignette Extending ggplot2 at https://cran.r-project.org/package=ggplot2/vignettes/extending-ggplot2.html for information on how to create a new stat.

Note

The function stat_mean_sd_text is called by the function geom_stripchart.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis (Use R!). Second Edition. Springer.

See Also

Examples

# First, load and attach the ggplot2 package.
  #--------------------------------------------

  library(ggplot2)

  #====================

  # Example 1:

  # Using the built-in data frame mtcars, 
  # plot miles per gallon vs. number of cylinders
  # using different colors for each level of the number of cylinders.
  #------------------------------------------------------------------

  p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, color = factor(cyl))) + 
    theme(legend.position = "none")

  p + geom_point() + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")

  # Now add text indicating the mean and standard deviation 
  # for each level of cylinder.
  #--------------------------------------------------------

  dev.new()
  p + geom_point() + 
    stat_mean_sd_text() + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")
  
  #====================

  # Example 2:

  # Repeat Example 1, but:
  # 1) facet by transmission type, 
  # 2) make the size of the text smaller.
  #--------------------------------------

  dev.new()
  p + geom_point() + 
    stat_mean_sd_text(size = 3) + 
    facet_wrap(~ am, labeller = label_both) +  
    labs(x = "Number of Cylinders", y = "Miles per Gallon")
 
  #====================

  # Example 3:

  # Repeat Example 1, but specify the y-position for the text.
  #-----------------------------------------------------------

  dev.new()
  p + geom_point() + 
    stat_mean_sd_text(y.pos = 36) + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")
  
  #====================

  # Example 4:

  # Repeat Example 1, but show the 
  # mean and standard deviation in a text box.
  #-------------------------------------------

  dev.new()
  p + geom_point() + 
    stat_mean_sd_text(text.box = TRUE) + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")

  #====================

  # Example 5:

  # Repeat Example 1, but use the color brown for the text.
  #--------------------------------------------------------
 
  dev.new()
  p + geom_point() + 
    stat_mean_sd_text(color = "brown") + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")

  #====================

  # Example 6:

  # Repeat Example 1, but:
  # 1) use the same colors for the text that are used for each group, 
  # 2) use the bold monospaced font.
  #------------------------------------------------------------------
 
  mat <- ggplot_build(p)$data[[1]]
  group <- mat[, "group"]
  colors <- mat[match(1:max(group), group), "colour"]

  dev.new()
  p + geom_point() + 
    stat_mean_sd_text(color = colors, size = 5, 
      family = "mono", fontface = "bold") + 
    labs(x = "Number of Cylinders", y = "Miles per Gallon")

  #====================

  # Clean up
  #---------

  graphics.off()
  rm(p, mat, group, colors)

EnvStats

Package for Environmental Statistics, Including US EPA Guidance

v2.4.0
GPL (>= 3)
Authors
Steven P. Millard [aut], Alexander Kowarik [ctb, cre] (<https://orcid.org/0000-0001-8598-4130>)
Initial release
2020-10-20

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.