Additivity and variance stabilization for regression
Estimate transformations of x and y such that the regression of y on x is approximately linear with constant variance.
avas(x, y, wt = rep(1, nrow(x)), cat = NULL, mon = NULL, lin = NULL, circ = NULL, delrsq = 0.01, yspan = 0)
x |
a matrix containing the independent variables. |
y |
a vector containing the response variable. |
wt |
an optional vector of weights. |
cat |
an optional integer vector specifying which variables
assume categorical values. Positive values in |
mon |
an optional integer vector specifying which variables are
to be transformed by monotone transformations. Positive values
in |
lin |
an optional integer vector specifying which variables are
to be transformed by linear transformations. Positive values in
|
circ |
an integer vector specifying which variables assume
circular (periodic) values. Positive values in |
delrsq |
termination threshold. Iteration stops when R-squared
changes by less than |
yspan |
an optional window size parameter for smoothing the variance. Range is [0, 1]. Default is 0 (cross-validated choice); 0.5 is a reasonable alternative to try. |
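The arguments above can be combined as in the following minimal sketch. It assumes that avas() is the function shipped in the acepack package and that the integer codes in mon and lin index columns of x; the simulated data and variable names are purely illustrative.

```r
# Assumption: avas() is provided by the 'acepack' package.
library(acepack)

set.seed(42)                       # illustrative simulated data
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- exp(2 * x1 + x2 + rnorm(n) / 4)

x <- cbind(x1, x2)                 # predictors as columns of a matrix

fit <- avas(x, y,
            mon    = 1,            # column 1 of x: monotone transformation
            lin    = 2,            # column 2 of x: linear transformation
            delrsq = 0.001,        # stricter stopping threshold than the default 0.01
            yspan  = 0.5)          # fixed span instead of the cross-validated choice

fit$rsq                            # R-squared on the transformed scale
```

Variables not listed in cat, mon, lin or circ are left to the default smooth transformation.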
A structure with the following components:
x |
the input x matrix. |
y |
the input y vector. |
tx |
the transformed x values. |
ty |
the transformed y values. |
rsq |
the multiple R-squared value for the transformed values. |
l |
the codes for cat, mon, ... |
m |
not used in this version of avas |
yspan |
span used for smoothing the variance |
iters |
iteration number and rsq for that iteration |
niters |
number of iterations used |
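As a rough sketch of how the returned components might be inspected, assuming avas() comes from the acepack package and that the fitted model on the transformed scale is ty against the row sums of tx:

```r
# Assumption: avas() is provided by the 'acepack' package.
library(acepack)

set.seed(1)                        # illustrative simulated data
x <- runif(200, 0, 2 * pi)
y <- exp(sin(x) + rnorm(200) / 2)
a <- avas(x, y)

a$rsq      # multiple R-squared for the transformed values
a$niters   # number of iterations used
a$iters    # iteration number and rsq for each iteration
a$yspan    # span used for smoothing the variance

# Residuals on the transformed scale (assumed additive fit: sum of tx columns)
fitted_t <- rowSums(as.matrix(a$tx))
res      <- a$ty - fitted_t
plot(fitted_t, res)                # spread should look roughly constant
```

If the variance stabilization worked, the residual plot should show no obvious funnel shape.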
Rob Tibshirani (1988), “Estimating transformations for regression via additivity and variance stabilization”. Journal of the American Statistical Association 83, 394-405.
TWOPI <- 8*atan(1)
x <- runif(200,0,TWOPI)
y <- exp(sin(x)+rnorm(200)/2)
a <- avas(x,y)
par(mfrow=c(3,1))
plot(a$y,a$ty)  # view the response transformation
plot(a$x,a$tx)  # view the carrier transformation
plot(a$tx,a$ty) # examine the linearity of the fitted model

# From D. Wang and M. Murphy (2005), Identifying nonlinear relationships
# in regression using the ACE algorithm. Journal of Applied Statistics,
# 32, 243-258, adapted for avas.
X1 <- runif(100)*2-1
X2 <- runif(100)*2-1
X3 <- runif(100)*2-1
X4 <- runif(100)*2-1

# Original equation of Y:
Y <- log(4 + sin(3*X1) + abs(X2) + X3^2 + X4 + .1*rnorm(100))

# Transformed version so that Y, after transformation, is a
# linear function of transforms of the X variables:
# exp(Y) = 4 + sin(3*X1) + abs(X2) + X3^2 + X4

a1 <- avas(cbind(X1,X2,X3,X4),Y)

par(mfrow=c(2,1))

# For each variable, show its transform as a function of
# the original variable and of the transform that created it,
# showing that the transform is recovered.
plot(X1,a1$tx[,1])
plot(sin(3*X1),a1$tx[,1])

plot(X2,a1$tx[,2])
plot(abs(X2),a1$tx[,2])

plot(X3,a1$tx[,3])
plot(X3^2,a1$tx[,3])

plot(X4,a1$tx[,4])
plot(X4,a1$tx[,4])

plot(Y,a1$ty)
plot(exp(Y),a1$ty)