
multi.compare

Multivariate comparison of synthesised and observed data


Description

Produces graphical comparisons of a variable (var) in the synthesised data set(s) with the same variable in the original (observed) data set, within subgroups defined by the variables named in the vector by. var can be a factor or a continuous variable, and the type of plot produced depends on the class of var. The variables in by will usually be factors or variables with only a few distinct values.
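
The idea of a within-subgroup comparison can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data frames obs_df and syn_df and their values are hypothetical toy data, with smoke playing the role of var and sex the role of by.

```r
# Hypothetical toy data standing in for an observed and a synthesised data set
obs_df <- data.frame(sex   = factor(c("F", "F", "M", "M", "F", "M")),
                     smoke = factor(c("yes", "no", "no", "no", "yes", "yes")))
syn_df <- data.frame(sex   = factor(c("F", "M", "M", "F", "F", "M")),
                     smoke = factor(c("no", "no", "yes", "yes", "yes", "no")))

# Stack both data sets and label each row with its source
both <- rbind(cbind(source = "observed",  obs_df),
              cbind(source = "synthetic", syn_df))

# Counts of smoke by source within each level of sex
counts <- table(smoke = both$smoke, source = both$source, sex = both$sex)

# Proportions of smoke within each source-by-sex cell
props <- prop.table(counts, margin = c(2, 3))
print(counts)
print(props)
```

Comparing the "observed" and "synthetic" columns within each sex panel is the tabular analogue of the plots described above.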

Usage

multi.compare(object, data, var = NULL, by = NULL, msel = NULL, 
  barplot.position = "fill", cont.type = "hist", y.hist = "count", 
  boxplot.point = TRUE, binwidth = NULL, ...)

Arguments

object

an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn() and it includes object$m synthesised data set(s).

data

an original (observed) data set.

var

variable to be compared between observed and synthetic data within subgroups.

by

variables to be tabulated or cross-tabulated to form groups.

barplot.position

type of barplot. The default "fill" gives a single bar with the proportions in each group while "dodge" gives side-by-side bars with the numbers in each category.

cont.type

default "hist" gives histograms and "boxplot" gives boxplots.

y.hist

defines y scale for histograms - "count" is default; "density" gives proportions.

boxplot.point

default (TRUE) adds individual points to boxplots.

msel

numbers of the synthetic data sets to be used; they must be in the range 1:object$m. Defaults to 1:object$m.

binwidth

sets width of a bin for histograms.

...

additional parameters that can be supplied to ggplot.
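
The difference between the "count" and "density" scales of y.hist, and the role of binwidth, can be illustrated with base R's hist(). This sketch assumes the usual histogram convention in which density is the count divided by (number of observations x bin width); the package's own plots are drawn with ggplot and may differ in detail. The age values are hypothetical.

```r
set.seed(1)
age <- round(runif(200, min = 18, max = 90))  # hypothetical ages

binwidth <- 5
breaks <- seq(15, 90, by = binwidth)          # bins of width 5 covering all ages

h <- hist(age, breaks = breaks, plot = FALSE)

h$counts    # "count" scale: number of observations per bin
h$density   # "density" scale: counts / (n * binwidth)

# The density scale rescales the counts so that the bar areas sum to 1
dens_from_counts <- h$counts / (length(age) * binwidth)
total_area <- sum(h$density * binwidth)
```

On the density scale, groups of different sizes can be compared directly because each histogram has the same total area.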

Value

Plots as specified above. A table of the numbers in the subgroups is printed to the R console.

Examples

### default synthesis of selected variables
library(synthpop)  # provides syn(), multi.compare() and the SD2011 data
vars <- c("sex", "age", "edu", "smoke")
ods  <- na.omit(SD2011[1:1000, vars])
s1 <- syn(ods)

### categorical var
multi.compare(s1, ods, var = "smoke", by = c("sex","edu"))

### numeric var
multi.compare(s1, ods, var = "age", by = "sex", y.hist = "density", binwidth = 5)
multi.compare(s1, ods, var = "age", by = c("sex", "edu"), cont.type = "boxplot")
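
A further example, offered as a sketch rather than part of the original help page, shows how msel and barplot.position might be combined when several syntheses are available. It assumes the synthpop package is installed; the object name s3, the choice m = 3 and the seed value are arbitrary illustrations.

```r
library(synthpop)

vars <- c("sex", "age", "edu", "smoke")
ods  <- na.omit(SD2011[1:1000, vars])

# Three synthetic data sets instead of the default one (m = 3 is an arbitrary choice)
s3 <- syn(ods, m = 3, seed = 2024, print.flag = FALSE)

# Compare only the first two syntheses, with side-by-side bars of counts
multi.compare(s3, ods, var = "smoke", by = "sex",
              msel = 1:2, barplot.position = "dodge")
```

Values of msel outside 1:s3$m are not valid, as stated in the argument description above.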

synthpop

Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

v1.6-0
GPL-2 | GPL-3
Authors
Beata Nowok [aut, cre], Gillian M Raab [aut], Chris Dibben [ctb], Joshua Snoke [ctb], Caspar van Lissa [ctb]
Initial release
2020-09-03
