Multinomial Logit Model
Fits a multinomial logit model to a (preferably unordered) factor response.
multinomial(zero = NULL, parallel = FALSE, nointercept = NULL, refLevel = "(Last)", whitespace = FALSE)
zero |
Can be an integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
Any values must be from the set {1,2,...,M}.
The default value means none are modelled as intercept-only terms.
See |
parallel |
A logical, or formula specifying which terms have equal/unequal coefficients. |
nointercept, whitespace |
See |
refLevel |
Either a (1) single positive integer or (2) a value of the factor
or (3) a character string.
If inputted as an integer then it specifies which
column of the response matrix is the reference or baseline level.
The default is the last one (the (M+1)th one).
If used, this argument will be usually assigned the value |
In this help file the response Y is assumed to be a factor with unordered values 1,2,…,M+1, so that M is the number of linear/additive predictors eta_j.
The default model can be written
eta_j = log(P[Y=j]/ P[Y=M+1])
where eta_j is the jth linear/additive predictor.
Here, j=1,…,M, and eta_{M+1}
is 0 by definition. That is, the last level of the factor,
or last column of the response matrix, is taken as the
reference level or baseline—this is for identifiability
of the parameters. The reference or baseline level can
be changed with the refLevel
argument.
In almost all the literature, the constraint matrices
associated with this family of models are known.
For example, setting parallel = TRUE
will make all
constraint matrices
(including the intercept)
equal to a vector of M 1's;
to suppress the intercepts from being parallel then
set parallel = FALSE ~ 1
.
If the constraint matrices are
unknown and to be estimated, then this can be achieved by
fitting the model as a reduced-rank vector generalized
linear model (RR-VGLM; see rrvglm
).
In particular, a multinomial logit model with unknown
constraint matrices is known as a stereotype
model (Anderson, 1984), and can be fitted with
rrvglm
.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
,
rrvglm
and vgam
.
No check is made to verify that the response is nominal.
See CommonVGAMffArguments
for more warnings.
The multinomial logit model is more appropriate for a nominal
(unordered) factor response than for an ordinal (ordered) factor
response.
Models more suited for the latter include those based on
cumulative probabilities, e.g., cumulative
.
multinomial
is prone to numerical difficulties if
the groups are separable and/or the fitted probabilities
are close to 0 or 1. The fitted values returned
are estimates of the probabilities P[Y=j] for
j=1,…,M+1. See safeBinaryRegression
for the logistic regression case.
Here is an example of the usage of the parallel
argument. If there are covariates x2
, x3
and x4
, then parallel = TRUE ~ x2 + x3 -
1
and parallel = FALSE ~ x4
are equivalent. This
would constrain the regression coefficients for x2
and x3
to be equal; those of the intercepts and
x4
would be different.
In Example 4 below, a conditional logit model is
fitted to an artificial data set that explores how
cost and travel time affect people's decision about
how to travel to work. Walking is the baseline group.
The variable Cost.car
is the difference between
the cost of travel to work by car and walking, etc. The
variable Time.car
is the difference between
the travel duration/time to work by car and walking,
etc. For other details about the xij
argument see
vglm.control
and fill
.
The multinom
function in the
nnet package uses the first level of the factor as
baseline, whereas the last level of the factor is used
here. Consequently the estimated regression coefficients
differ.
Thomas W. Yee
Yee, T. W. (2010). The VGAM package for categorical data analysis. Journal of Statistical Software, 32, 1–34. https://www.jstatsoft.org/v32/i10/.
Yee, T. W. and Hastie, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Agresti, A. (2013). Categorical Data Analysis, 3rd ed. Hoboken, NJ, USA: Wiley.
Hastie, T. J., Tibshirani, R. J. and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York, USA: Springer-Verlag.
Simonoff, J. S. (2003). Analyzing Categorical Data, New York, USA: Springer-Verlag.
Anderson, J. A. (1984). Regression and ordered categorical variables. Journal of the Royal Statistical Society, Series B, Methodological, 46, 1–30.
Tutz, G. (2012). Regression for Categorical Data, Cambridge University Press.
# Example 1: fit a multinomial logit model to Edgar Anderson's iris data data(iris) ## Not run: fit <- vglm(Species ~ ., multinomial, iris) coef(fit, matrix = TRUE) ## End(Not run) # Example 2a: a simple example ycounts <- t(rmultinom(10, size = 20, prob = c(0.1, 0.2, 0.8))) # Counts fit <- vglm(ycounts ~ 1, multinomial) head(fitted(fit)) # Proportions fit@prior.weights # NOT recommended for extraction of prior weights weights(fit, type = "prior", matrix = FALSE) # The better method depvar(fit) # Sample proportions; same as fit@y constraints(fit) # Constraint matrices # Example 2b: Different reference level used as the baseline fit2 <- vglm(ycounts ~ 1, multinomial(refLevel = 2)) coef(fit2, matrix = TRUE) coef(fit , matrix = TRUE) # Easy to reconcile this output with fit2 # Example 3: The response is a factor. nn <- 10 dframe3 <- data.frame(yfac = gl(3, nn, labels = c("Ctrl", "Trt1", "Trt2")), x2 = runif(3 * nn)) myrefLevel <- with(dframe3, yfac[12]) fit3a <- vglm(yfac ~ x2, multinomial(refLevel = myrefLevel), dframe3) fit3b <- vglm(yfac ~ x2, multinomial(refLevel = 2), dframe3) coef(fit3a, matrix = TRUE) # "Trt1" is the reference level coef(fit3b, matrix = TRUE) # "Trt1" is the reference level margeff(fit3b) # Example 4: Fit a rank-1 stereotype model fit4 <- rrvglm(Country ~ Width + Height + HP, multinomial, data = car.all) coef(fit4) # Contains the C matrix constraints(fit4)$HP # The A matrix coef(fit4, matrix = TRUE) # The B matrix Coef(fit4)@C # The C matrix concoef(fit4) # Better to get the C matrix this way Coef(fit4)@A # The A matrix svd(coef(fit4, matrix = TRUE)[-1, ])$d # This has rank 1; = C %*% t(A) # Classification (but watch out for NAs in some of the variables): apply(fitted(fit4), 1, which.max) # Classification colnames(fitted(fit4))[apply(fitted(fit4), 1, which.max)] # Classification apply(predict(fit4, car.all, type = "response"), 1, which.max) # Ditto # Example 5: The use of the xij argument (aka conditional logit model) set.seed(111) nn <- 100 # Number of people who travel to work M <- 3 # There are M+1 models of transport to go to work ycounts <- matrix(0, nn, M+1) ycounts[cbind(1:nn, sample(x = M+1, size = nn, replace = TRUE))] = 1 dimnames(ycounts) <- list(NULL, c("bus","train","car","walk")) gotowork <- data.frame(cost.bus = runif(nn), time.bus = runif(nn), cost.train= runif(nn), time.train= runif(nn), cost.car = runif(nn), time.car = runif(nn), cost.walk = runif(nn), time.walk = runif(nn)) gotowork <- round(gotowork, digits = 2) # For convenience gotowork <- transform(gotowork, Cost.bus = cost.bus - cost.walk, Cost.car = cost.car - cost.walk, Cost.train = cost.train - cost.walk, Cost = cost.train - cost.walk, # for labelling Time.bus = time.bus - time.walk, Time.car = time.car - time.walk, Time.train = time.train - time.walk, Time = time.train - time.walk) # for labelling fit <- vglm(ycounts ~ Cost + Time, multinomial(parall = TRUE ~ Cost + Time - 1), xij = list(Cost ~ Cost.bus + Cost.train + Cost.car, Time ~ Time.bus + Time.train + Time.car), form2 = ~ Cost + Cost.bus + Cost.train + Cost.car + Time + Time.bus + Time.train + Time.car, data = gotowork, trace = TRUE) head(model.matrix(fit, type = "lm")) # LM model matrix head(model.matrix(fit, type = "vlm")) # Big VLM model matrix coef(fit) coef(fit, matrix = TRUE) constraints(fit) summary(fit) max(abs(predict(fit) - predict(fit, new = gotowork))) # Should be 0
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.