sdc

Tools for statistical disclosure control (sdc)


Description

Labeling and removing unique replicates of unique actual (observed) individuals.

Usage

sdc(object, data, label = NULL, rm.replicated.uniques = FALSE, 
 uniques.exclude = NULL, recode.vars = NULL, bottom.top.coding = NULL, 
 recode.exclude = NULL, smooth.vars = NULL)

Arguments

object

an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn() and it includes object$m synthesised data set(s).

data

the original (observed) data set.

label

a single string with a label to be added to the synthetic data sets as a new variable to make it clear that the data are synthetic/fake.

rm.replicated.uniques

a logical value indicating whether unique replicates of units that are also unique in the original data set should be removed.

uniques.exclude

a single string or a vector of strings with name(s) of variable(s) to be excluded from the identification of uniques.
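
To make the idea of "replicated uniques" concrete, here is a minimal base R sketch. It assumes that a replicated unique is a synthetic record that is unique within the synthetic data and also matches a record that is unique in the original data. The helper `flag_replicated_uniques()` and its `exclude` argument are hypothetical names used for illustration only. This is not the synthpop implementation.

```r
# Hypothetical helper (not part of synthpop): flag synthetic rows that are
# unique in the synthetic data AND match a row that is unique in the original.
flag_replicated_uniques <- function(syn, obs, exclude = NULL) {
  # Compare only variables present in both data sets, minus excluded ones
  vars <- setdiff(intersect(names(syn), names(obs)), exclude)
  # Build one character key per row from the compared variables
  key <- function(d) do.call(paste, c(lapply(d[vars], as.character), sep = "\r"))
  # TRUE for keys that occur exactly once
  uniq <- function(k) !(duplicated(k) | duplicated(k, fromLast = TRUE))
  k_syn <- key(syn)
  k_obs <- key(obs)
  uniq(k_syn) & k_syn %in% k_obs[uniq(k_obs)]
}

obs <- data.frame(sex = c("F", "M", "M", "F"), age = c(30, 41, 41, 52))
syn <- data.frame(sex = c("F", "M", "F", "F"), age = c(30, 41, 52, 52))

flagged  <- flag_replicated_uniques(syn, obs)
syn_safe <- syn[!flagged, ]   # synthetic data with replicated uniques removed
```

In this toy data only the first synthetic row (F, 30) is flagged: it is unique in both data sets. Passing `exclude = "age"` compares on `sex` alone, which mirrors the role of uniques.exclude.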

recode.vars

a single string or a vector of strings with name(s) of variable(s) to be bottom- and/or top-coded.

bottom.top.coding

a list of two-element vectors specifying the bottom and top codes for each variable in recode.vars, in the same order. Use NA where no bottom or top coding is needed. If only one variable is to be recoded, the codes can be given as a single two-element vector.

recode.exclude

a list specifying, for each variable in recode.vars, the values to be excluded from recoding, e.g. missing data codes. Use NA if all values should be considered for recoding. If only one variable is to be recoded, the code(s) can be given as a single number or a vector.
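
The interplay of bottom.top.coding and recode.exclude can be sketched in base R as follows. The helper `bottom_top_code()` is a hypothetical name. It assumes that bottom coding raises values below the bottom code to that code, that top coding lowers values above the top code to that code, and that excluded values are left untouched. It illustrates the idea and is not the code used inside sdc().

```r
# Hypothetical helper (not the synthpop implementation).
# codes:   c(bottom, top); NA means "no coding on that side"
# exclude: values left as they are, e.g. missing data codes
bottom_top_code <- function(x, codes, exclude = NA) {
  keep <- !is.na(x) & !(x %in% exclude)   # values eligible for recoding
  if (!is.na(codes[1])) x[keep & x < codes[1]] <- codes[1]
  if (!is.na(codes[2])) x[keep & x > codes[2]] <- codes[2]
  x
}

# Top-code income at 2000, leaving NA and the missing data code -8 alone
income <- c(-8, 500, 1500, 3200, NA)
income_coded <- bottom_top_code(income, codes = c(NA, 2000), exclude = c(NA, -8))

# Bottom-code age at 20 and top-code it at 80, no exclusions
age <- c(16, 35, 92)
age_coded <- bottom_top_code(age, codes = c(20, 80))
```

Here `income_coded` is `-8, 500, 1500, 2000, NA` and `age_coded` is `20, 35, 80`.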

smooth.vars

a single string or a vector of strings with name(s) of numeric variable(s) to be smoothed (smooth.spline function is used).
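
As a rough illustration of spline smoothing applied to a numeric variable, the sketch below fits `stats::smooth.spline()` to the sorted values and writes the fitted values back in the original record order. The helper `smooth_values()` is a hypothetical name, and the choice to smooth the sorted values against their rank is an assumption made for this example. The exact procedure used by sdc() may differ.

```r
# Hypothetical helper for illustration; not taken from synthpop.
smooth_values <- function(y) {
  ok  <- !is.na(y)                 # leave missing values untouched
  ord <- order(y[ok])              # positions of non-missing values, ascending
  # Smooth the sorted values against their rank (assumed approach)
  fit <- smooth.spline(x = seq_along(ord), y = y[ok][ord])
  out <- y
  out[ok][ord] <- fit$y            # put fitted values back in record order
  out
}

set.seed(1)
y <- c(round(rexp(200, rate = 1 / 1500)), NA)   # skewed toy "income" plus one NA
smoothed <- smooth_values(y)
```

The result has the same length as the input and keeps missing values in place. Only the non-missing values are replaced by their smoothed counterparts.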

Value

The object of class synds supplied as the object argument, adjusted according to the values of the other arguments.

Examples

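library(synthpop)

# Use the first 1000 records of the SD2011 data set
ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "income")]
s1 <- syn(ods, m = 2)  # two synthesised data sets

# Label the data, remove replicated uniques, bottom/top-code age and income
s1.sdc <- sdc(s1, ods, label = "false_data", rm.replicated.uniques = TRUE,
              recode.vars = c("age", "income"),
              bottom.top.coding = list(c(20, 80), c(NA, 2000)),
              recode.exclude = list(NA, c(NA, -8)))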

synthpop

Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

v1.6-0
GPL-2 | GPL-3
Authors
Beata Nowok [aut, cre], Gillian M Raab [aut], Chris Dibben [ctb], Joshua Snoke [ctb], Caspar van Lissa [ctb]
Initial release
2020-09-03
