Multivariate comparison of synthesised and observed data
Graphical comparisons of a variable (var) in the synthesised data set
with the original (observed) data set within subgroups defined by the
variables in a vector by. var can be a factor or a continuous
variable and the plots produced will depend on the class of var.
The variables in by will usually be factors or variables with only
a few values.
multi.compare(object, data, var = NULL, by = NULL, msel = NULL,
              barplot.position = "fill", cont.type = "hist", y.hist = "count",
              boxplot.point = TRUE, binwidth = NULL, ...)
object | an object of class |
data | an original (observed) data set. |
var | variable to be compared between observed and synthetic data within subgroups. |
by | variables to be tabulated or cross-tabulated to form groups. |
barplot.position | type of barplot. The default is "fill". |
cont.type | type of plot for a continuous var. The default "hist" produces histograms; "boxplot" produces boxplots. |
y.hist | defines y scale for histograms - "count" (default) or "density". |
boxplot.point | default (TRUE) |
msel | numbers of synthetic data sets to be used - must be numbers in the range |
binwidth | sets width of a bin for histograms. The default is NULL. |
... | additional parameters that can be supplied to |
Plots as specified above. A table of the numbers in the subgroups is printed to the R console.
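The printed table of subgroup numbers can be pictured with a small base R sketch. This is only an illustration of cross-tabulating the by variables, not necessarily how multi.compare builds its table; the data frame toy and its values are hypothetical.

```r
# Hypothetical toy data standing in for an observed data set
toy <- data.frame(
  sex = factor(c("MALE", "FEMALE", "FEMALE", "MALE", "FEMALE", "MALE")),
  edu = factor(c("PRIMARY", "SECONDARY", "PRIMARY",
                 "SECONDARY", "SECONDARY", "PRIMARY"))
)

# Grouping variables, in the same form as the by argument
by <- c("sex", "edu")

# Cross-tabulate the grouping variables to get the number of rows per subgroup
counts <- table(toy[, by, drop = FALSE])
print(counts)
```

Subgroups with very few rows are worth spotting in such a table, since plots for them will be based on little data.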
### default synthesis of selected variables
vars <- c("sex", "age", "edu", "smoke")
ods <- na.omit(SD2011[1:1000, vars])
s1 <- syn(ods)

### categorical var
multi.compare(s1, ods, var = "smoke", by = c("sex","edu"))

### numeric var
multi.compare(s1, ods, var = "age", by = c("sex"), y.hist = "density", binwidth = 5)
multi.compare(s1, ods, var = "age", by = c("sex", "edu"), cont.type = "boxplot")
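The examples above do not exercise msel. The following sketch shows one way it might be used, assuming the synthpop package is installed and that several synthetic data sets have been generated through the m argument of syn(). The object name s3 and the choice of three syntheses are illustrative only.

```r
library(synthpop)

### same observed data as in the examples above
vars <- c("sex", "age", "edu", "smoke")
ods <- na.omit(SD2011[1:1000, vars])

### generate three synthetic data sets instead of one
s3 <- syn(ods, m = 3)

### restrict the comparison to the first two synthetic data sets
multi.compare(s3, ods, var = "smoke", by = "sex", msel = 1:2)
```

The values passed to msel must refer to synthetic data sets that exist in the object, so with m = 3 only the numbers 1, 2 and 3 are valid.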