Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

glm.synds

Fitting (generalized) linear models to synthetic data


Description

Fits generalized linear models or simple linear models to the synthesised data set(s) using glm and lm function respectively.

Usage

glm.synds(formula, family = "binomial", data,  ...)
lm.synds(formula, data, ...)

## S3 method for class 'fit.synds'
print(x, msel = NULL, ...)

Arguments

formula

a symbolic description of the model to be estimated. A typical model has the form response ~ predictors. See the documentation of glm and formula for details.

family

a description of the error distribution and link function to be used in the model. See the documentation of glm and family for details.

data

an object of class synds, which stands for 'synthesised data set'. It is typically created by function syn and it includes data$m synthesised data set(s).

...

additional parameters passed to glm or lm.

x

an object of class fit.synds.

msel

index or indices of synthetic data copies for which coefficient estimates are to be displayed. If NULL (default) the combined (average) coefficient estimates are printed.

Value

The summary function (summary.fit.synds) can be used to obtain the combined results of models fitted to each of the m synthetic data sets.

An object of class fit.synds. It is a list with the following components:

call

the original call to glm.synds or lm.synds.

mcoefavg

combined (average) coefficient estimates.

mvaravg

combined (average) variance estimates of mcoef.

analyses

summary.glm or summary.lm object respectively or a list of m such objects.

fitting.function

function used to fit the model.

n

a number of cases in the original data.

k

a number of cases in the synthesised data.

proper

a logical value indicating whether synthetic data were generated using proper synthesis.

m

the number of synthetic versions of the observed data.

method

a vector of synthesising methods applied to each variable in the saved synthesised data.

incomplete

a logical value indicating whether any of the variables in the model were not synthesised.

mcoef

a matrix of coefficients estimates from all m syntheses.

mvar

a matrix of variance estimates from all m syntheses.

See Also

Examples

### Logit model 
ods <- SD2011[1:1000, c("sex", "age", "edu", "marital", "ls", "smoke")]
s1 <- syn(ods, m = 3)
f1 <- glm.synds(smoke ~ sex + age + edu + marital + ls, data = s1, family = "binomial")
f1
print(f1, msel = 1:2)
  
### Linear model
ods <- SD2011[1:1000,c("sex", "age", "income", "marital", "depress")]
ods$income[ods$income == -8] <- NA
s2 <- syn(ods, m = 3)
f2 <- lm.synds(depress ~ sex + age + log(income) + marital, data = s2)
f2
print(f2,1:3)

synthpop

Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control

v1.6-0
GPL-2 | GPL-3
Authors
Beata Nowok [aut, cre], Gillian M Raab [aut], Chris Dibben [ctb], Joshua Snoke [ctb], Caspar van Lissa [ctb]
Initial release
2020-09-03

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.