
svyvif

Variance inflation factors (VIF) for linear models fitted with complex survey data


Description

Compute a VIF for fixed-effects linear regression models fitted with data collected from one- and two-stage complex survey designs.

Usage

svyvif(X, w, V)

Arguments

X

n x p matrix of real-valued covariates used in fitting a linear regression; n = number of observations, p = number of covariates in the model, excluding the intercept. A column of 1's for an intercept should not be included. X should not contain columns for the strata and cluster identifiers (unless those variables are part of the model). No missing values are allowed.

w

n-vector of survey weights used in fitting the model. No missing values are allowed.

V

n x n covariance matrix of the residuals as estimated, e.g., using Vmat. No missing values are allowed.

Details

svyvif computes a variance inflation factor (VIF) appropriate for a model fitted from complex survey data (see Liao & Valliant 2012). A VIF measures the inflation of the variance of a slope estimate caused by nonorthogonality of the predictors, over and above what the variance would be with orthogonality (Theil 1971; Belsley, Kuh, and Welsch 1980). The standard VIF equals 1/(1 - R^2_k), where R_k is the multiple correlation from regressing the kth column of X on the remaining columns. The complex sample VIF is the standard VIF multiplied by two adjustments, denoted in the output as zeta and varrho. There is no widely agreed-upon cutoff value for identifying high values of a VIF.
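The standard VIF formula above can be illustrated with base R alone. The following is a minimal sketch on simulated data, not the svyvif implementation: it ignores survey weights and the zeta and varrho adjustments, so its values need not match the reg.vif column that svyvif returns. The helper name std_vif and the simulated covariates are assumptions made for illustration.

```r
# Textbook (unweighted) VIF: 1 / (1 - R^2_k), where R^2_k comes from
# regressing column k of X on the remaining columns.
# std_vif is a hypothetical helper, not part of svydiags.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.5)   # deliberately correlated with x1
x3 <- rnorm(n)
X  <- cbind(x1 = x1, x2 = x2, x3 = x3)

std_vif <- function(X) {
  vapply(seq_len(ncol(X)), function(k) {
    fit <- lm(X[, k] ~ X[, -k])       # intercept is included by default
    1 / (1 - summary(fit)$r.squared)
  }, numeric(1))
}

vif_lm  <- setNames(std_vif(X), colnames(X))
# Equivalent closed form: diagonal of the inverse correlation matrix
vif_cor <- diag(solve(cor(X)))

print(round(rbind(vif_lm, vif_cor), 3))
```

Both rows of the printed table should agree, and x1 and x2 should show larger values than x3 because they were simulated to be correlated with each other.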

Value

p x 5 matrix with columns:

svy.vif

complex sample VIF

reg.vif

standard VIF, 1/(1 - R^2_k)

zeta

1st multiplicative adjustment to reg.vif

varrho

2nd multiplicative adjustment to reg.vif

zeta.x.varrho

product of the two adjustments to reg.vif
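Because the result is an ordinary numeric matrix, it can be sorted and subset by column name. The sketch below builds a small stand-in matrix with the documented column names; the numbers are invented for illustration and are not svyvif output. It then ranks the predictors by svy.vif and relies on the relationship described in Details, svy.vif = reg.vif x zeta x varrho.

```r
# Hypothetical stand-in for svyvif() output; all values are invented.
vif_out <- cbind(reg.vif = c(4.0, 1.2, 9.5),
                 zeta    = c(1.1, 0.9, 1.3),
                 varrho  = c(0.8, 1.0, 0.7))
rownames(vif_out) <- c("x1", "x2", "x3")

# Assemble the five documented columns from the three components.
adj     <- vif_out[, "zeta"] * vif_out[, "varrho"]
vif_out <- cbind(svy.vif = vif_out[, "reg.vif"] * adj,
                 vif_out,
                 zeta.x.varrho = adj)

# Sort predictors from largest to smallest complex sample VIF.
ranked <- vif_out[order(vif_out[, "svy.vif"], decreasing = TRUE), , drop = FALSE]
print(round(ranked, 3))
```

With a real result, the same order() call can be applied directly to the matrix returned by svyvif.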

Author(s)

Richard Valliant

References

Belsley, D.A., Kuh, E. and Welsch, R.E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley-Interscience.

Liao, D. and Valliant, R. (2012). Variance inflation factors in the analysis of complex survey data. Survey Methodology, 38, 53-62.

Theil, H. (1971). Principles of Econometrics. New York: John Wiley & Sons, Inc.

Lumley, T. (2010). Complex Surveys. New York: John Wiley & Sons.

Lumley, T. (2018). survey: analysis of complex survey samples. R package version 3.34.

See Also

Examples

library(survey)
library(svydiags)   # package documented here; provides svyvif and Vmat
data(nhanes2007)
X1 <- nhanes2007[order(nhanes2007$SDMVSTRA, nhanes2007$SDMVPSU),]
    # eliminate cases with missing values; logical indexing is safe
    # even when no rows are incomplete
X2 <- X1[complete.cases(X1),]
    # stratified cluster design with PSUs nested within strata
nhanes.dsgn <- svydesign(ids = ~SDMVPSU,
                         strata = ~SDMVSTRA,
                         weights = ~WTDRD1, nest=TRUE, data=X2)
m1 <- svyglm(BMXWT ~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL
            + DR1TTFAT + DR1TMFAT, design=nhanes.dsgn)
summary(m1)
    # estimate the covariance matrix of the residuals
V <- Vmat(mobj = m1,
          stvar = "SDMVSTRA",
          clvar = "SDMVPSU")
    # construct X matrix using model.matrix from stats package
X3 <- model.matrix(~ RIDAGEYR + as.factor(RIDRETH1) + DR1TKCAL + DR1TTFAT + DR1TMFAT,
        data = data.frame(X2))
    # remove col of 1's for intercept with X3[,-1]
svyvif(X = X3[,-1], w = X2$WTDRD1, V = V)

svydiags

Linear Regression Model Diagnostics for Survey Data

v0.3
GPL (>= 2)
Authors
Richard Valliant
Initial release
2018-12-13
