Polyclass: polychotomous regression and multiple classification
Fit a polychotomous regression and multiple classification using linear splines and selected tensor products.
polyclass(data, cov, weight, penalty, maxdim, exclude, include, additive = FALSE, linear, delete = 2, fit, silent = TRUE, normweight = TRUE, tdata, tcov, tweight, cv, select, loss, seed)
data |
vector of classes:
|
cov |
covariates: matrix with as many rows as the length of |
weight |
optional vector of case-weights. Should have the same length as
|
penalty |
the parameter to be used in the AIC criterion if the
model selection is carried out by AIC. The program chooses
the number of knots that minimizes |
maxdim |
maximum dimension (default is
min(n, 4 * n^(1/3) * (cl - 1)), where
n is |
exclude |
combinations to be excluded - this should be a matrix with 2
columns - if for example |
include |
those combinations that can be included. Should have the same format
as |
additive |
should the model selection be restricted to additive models? |
linear |
vector indicating for which of the variables no knots should
be entered. For example, if |
delete |
should complete basis functions be deleted at once (2), should only individual dimensions be deleted (1) or should only the addition stage of the model selection be carried out (0)? |
fit |
|
silent |
suppresses the printing of diagnostic output about basis functions added or deleted, Rao-statistics, Wald-statistics and log-likelihoods. |
normweight |
should the weights be normalized so that they average to one? This option only has an effect if the model is selected using AIC. |
tdata,tcov,tweight |
test set. Should satisfy the same requirements as |
cv |
in how many subsets should the data be divided for cross-validation? If |
select |
if a test set is provided, or if the model is selected using cross validation, should the model be selected that minimizes (misclassification) loss (0), that maximizes test set log-likelihood (1) or that minimizes test set squared error loss (2)? |
loss |
a rectangular matrix specifying the loss function, whose
size is the number of
classes times number of actions.
Used for cross-validation and test set model
selection. |
seed |
optional
seed for the random number generator that determines the sequence of the
cases for cross-validation. If the seed has length 12 or more,
the first twelve elements are assumed to be |
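The maxdim default and the shape of the loss matrix described above can be worked out in base R before any model is fitted. The following is a minimal sketch, not part of the package documentation: it assumes that n is the number of cases and cl the number of classes (the text defining them is cut off above), and the names `n_cases`, `n_classes`, `default_maxdim` and `loss01` are hypothetical.

```r
# Illustrative sketch in base R; all variable names are hypothetical.
data(iris)
classes <- as.integer(iris[, 5])       # class labels coded 1, 2, 3
n_cases <- length(classes)             # assumed meaning of n
n_classes <- length(unique(classes))   # assumed meaning of cl

# Default maximum dimension, following the formula in the maxdim description.
default_maxdim <- min(n_cases, 4 * n_cases^(1/3) * (n_classes - 1))

# A 0-1 loss matrix: one row per class, one column per action.
# Here the actions are "predict class 1, 2 or 3", so the matrix is square.
loss01 <- 1 - diag(n_classes)
print(default_maxdim)
print(loss01)
```

A matrix of this shape could then be supplied as the loss argument when a test set or cross-validation is used for model selection.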
The output is an object of class polyclass, organized to serve as input for plot.polyclass, beta.polyclass, summary.polyclass, ppolyclass (fitted probabilities), cpolyclass (fitted classes) and rpolyclass (random classes).
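As a hedged illustration of how a fitted object might be passed on to these functions, the sketch below fits the iris data and then asks for fitted classes and fitted probabilities. The argument order used for cpolyclass and ppolyclass is an assumption based on the polspline help pages and should be checked against ?cpolyclass. The code only runs the fit when the polspline package is installed.

```r
# Sketch only: the fit is skipped when polspline is not installed.
class.iris <- NULL
prob.iris <- NULL
if (requireNamespace("polspline", quietly = TRUE)) {
  library(polspline)
  data(iris)
  fit.iris <- polyclass(iris[, 5], iris[, 1:4])

  # Overview of the selected model.
  print(summary(fit.iris))

  # Fitted classes for the training covariates (assumed signature: cov, fit).
  class.iris <- cpolyclass(iris[, 1:4], fit.iris)

  # Fitted class probabilities (assumed named arguments: cov, fit).
  prob.iris <- ppolyclass(cov = iris[, 1:4], fit = fit.iris)
}
```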
The function returns a list with the following members:
call |
the command that was executed. |
ncov |
number of covariates. |
ndim |
number of dimensions of the fitted model. |
nclass |
number of classes. |
nbas |
number of basis functions. |
naction |
number of possible actions that are considered. |
fcts |
matrix of size second element: which knot ( third element: second covariate involved ( fourth element: knot involved (if the third element is fifth, sixth,... element: beta (coefficient) for class one, two, ... |
knots |
a matrix with |
cv |
in how many sets was the data divided for cross-validation.
Only provided if |
loss |
the loss matrix used in cross-validation and test set.
Only provided if |
penalty |
the parameter used in the AIC criterion. Only provided if |
method |
0 = AIC, 1 = test set, 2 = cross-validation. |
ranges |
column |
logl |
matrix with eight or eleven columns. Summarizes fits.
Column one indicates the dimension,
column two the AIC or loss value, whichever was
used during the model selection,
columns three, four and five give the training set log-likelihood,
(misclassification) loss and squared error loss, columns six to
eight give the same information for the test set, column nine (or column
six if |
sample |
sample size. |
tsample |
the sample size of the test set. Only provided if |
wgtsum |
sum of the case weights. |
covnames |
names of the covariates. |
classnames |
(numerical) names of the classes. |
cv.aic |
the penalty value that was determined optimal by
cross validation. Only provided if |
cv.tab |
table with three columns. Column one and two indicate the penalty parameter
range for which the cv-loss in column three would be realized.
Only provided if |
seed |
the random seed that was used to determine the order
of the cases for cross-validation.
Only provided if |
delete |
were complete basis functions deleted at once (2), were only individual dimensions deleted (1) or was only the addition stage of the model selection carried out (0)? |
beta |
moments of basis functions. Needed for |
select |
if a test set is provided, or if the model is selected using cross validation, was the model selected that minimized (misclassification) loss (0), that maximized test set log-likelihood (1) or that minimized test set squared error loss (2)? |
anova |
matrix with three columns. The first two elements in a line indicate the subspace to which the line refers. The third element indicates the percentage of variance explained by that subspace. |
twgtsum |
sum of the test set case weights (only if |
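The members listed above can be read from the returned list with the usual `$` operator. The sketch below is an assumption-laden illustration rather than package documentation: it collects a few members into a hypothetical list `fit_info` and prints the first rows of logl. The fit is skipped when polspline is not installed.

```r
# Sketch only: inspect a few members of the returned list.
fit_info <- NULL
if (requireNamespace("polspline", quietly = TRUE)) {
  library(polspline)
  data(iris)
  fit.iris <- polyclass(iris[, 5], iris[, 1:4])

  # fit_info is a hypothetical helper list, not something polyclass returns.
  fit_info <- list(
    ncov   = fit.iris$ncov,    # number of covariates
    nclass = fit.iris$nclass,  # number of classes
    ndim   = fit.iris$ndim,    # dimension of the fitted model
    method = fit.iris$method   # 0 = AIC, 1 = test set, 2 = cross-validation
  )
  print(fit_info)

  # One row per candidate model considered during selection.
  print(head(fit.iris$logl))
}
```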
Charles Kooperberg clk@fredhutch.org.
Charles Kooperberg, Smarajit Bose, and Charles J. Stone (1997). Polychotomous regression. Journal of the American Statistical Association, 92, 117–127.
Charles J. Stone, Mark Hansen, Charles Kooperberg, and Young K. Truong (1997). The use of polynomial splines and their tensor products in extended linear modeling (with discussion). Annals of Statistics, 25, 1371–1470.
data(iris)
fit.iris <- polyclass(iris[,5], iris[,1:4])
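A further sketch, under stated assumptions, shows how the tdata and tcov arguments might be used to select the model on a held-out test set. The deterministic split (35 training cases per species), the integer coding of the classes and the names `train`, `x` and `fit.tt` are all choices made for this illustration. The fit is skipped when polspline is not installed.

```r
# Sketch only: model selection with a separate test set.
fit.tt <- NULL
if (requireNamespace("polspline", quietly = TRUE)) {
  library(polspline)
  data(iris)
  classes <- as.integer(iris[, 5])   # classes coded 1, 2, 3
  x <- as.matrix(iris[, 1:4])        # covariates as a numeric matrix

  # First 35 cases of each species for training, the remaining 15 for testing.
  train <- c(1:35, 51:85, 101:135)

  fit.tt <- polyclass(classes[train], x[train, ],
                      tdata = classes[-train], tcov = x[-train, ],
                      select = 0)    # 0: minimize (misclassification) loss
  print(summary(fit.tt))
}
```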