Function to Compute and Display 2-d Projections for Data Matrices
The function computes and displays 2-d projections of data matrices using Sammon Non-linear Mapping (default), Multidimensional Scaling, or Kruskal's non-metric Multidimensional Scaling (see Venables and Ripley (2001) and Cox and Cox (2001)). The original S-Plus implementation also computed the Minimum Spanning Tree plane projection (Friedman and Rafsky, 1981), which was available in the Venables and Ripley MASS library for S-Plus. However, the R implementation of the MASS library does not include Minimum Spanning Trees. In the R implementation, Projection Pursuit has been added using the fastICA procedure of Hyvarinen and Oja (2000). Provision is made to optionally trim individuals (rows) from the input data matrix.
gx.2dproj(xx, proc = "sam", ifilr = FALSE, log = FALSE, rsnd = FALSE, snd = FALSE, range = FALSE, main = "", setseed = FALSE, row.omits = NULL, ...)
xx |
the |
proc |
the 2-d projection procedure required, the default is |
ifilr |
optional isometric log-ratio transformation, the default is no transformation. Recommended for closed compositional, geochemical, data, when |
log |
optional (natural) log transformation of the data, the default is no log transformation. For a log transformation set |
rsnd |
optional robust normalization of the data with matrix column medians and MADs, the default is no transformation. For a robust normalization set |
snd |
optional normalization of the data with matrix column means and standard deviations, the default is no transformation. For a normalization set |
range |
optional range transformation for the matrix columns, the data values being scaled to between zero and one for, respectively, the minimum and maximum column values. If the data are range transformed, other normalization transformation requests will be ignored. |
main |
an alternative plot title, see Details below. |
row.omits |
permits rows, individuals, to be trimmed from the input matrix, the default |
setseed |
sets the random number seed for |
... |
further arguments to be passed to methods concerning the generated plots. For example, if smaller plotting characters are required, specify |
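The optional transformations described in the arguments above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code inside gx.2dproj: the helper names (robust_normalize, normalize, range_transform) and the toy matrix are hypothetical, and the use of R's default MAD consistency constant is an assumption.

```r
# Toy data matrix: 6 individuals (rows) by 3 variables (columns); hypothetical values
xx <- matrix(c(1, 2, 4, 8, 16, 32,
               10, 12, 11, 15, 14, 40,
               0.5, 0.7, 0.6, 0.9, 0.8, 3.0), ncol = 3)

# Robust normalization: subtract column medians, divide by column MADs
robust_normalize <- function(x) {
  med <- apply(x, 2, median)
  sc  <- apply(x, 2, mad)  # stats::mad with its default constant (assumption)
  sweep(sweep(x, 2, med, "-"), 2, sc, "/")
}

# Normalization: subtract column means, divide by column standard deviations
normalize <- function(x) scale(x, center = TRUE, scale = TRUE)

# Range transformation: column minimum maps to 0, column maximum to 1
range_transform <- function(x) {
  mn <- apply(x, 2, min)
  mx <- apply(x, 2, max)
  sweep(sweep(x, 2, mn, "-"), 2, mx - mn, "/")
}

xx_log   <- log(xx)               # natural log transformation
xx_rsnd  <- robust_normalize(xx)
xx_snd   <- normalize(xx)
xx_range <- range_transform(xx)
```

Each of these puts the columns on a comparable scale, which is the purpose the help text gives for transforming before projection.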
If main is undefined a default plot title is generated by appending the input matrix name to the text string "2-d Projection for: ". If no plot title is required set main = " ", or if a user defined plot title is required it should be defined in main, e.g., main = "Plot Title Text".
Firstly, if the input data matrix contains closed compositional, geochemical, data, it is strongly recommended that an ilr transform be applied to the data, ifilr = TRUE. This has the effect of reducing the dimension of the data matrix from p to (p-1). Otherwise, it is desirable to normalize (centre and scale) or range transform the data to ensure the variables have equal ‘weight’ in the projections. If no transformation is requested a warning message is displayed.
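The reduction from p to (p-1) dimensions under an ilr transform can be seen in a small base R sketch. The function ilr_sketch and the toy composition are hypothetical, and the orthonormal basis used here is one common choice, assumed for illustration; it is not necessarily the basis used by the package.

```r
# ilr transform using one particular orthonormal basis (an assumption):
# z_i = sqrt(i / (i + 1)) * (mean(log x_1, ..., log x_i) - log x_(i+1))
ilr_sketch <- function(x) {
  x  <- as.matrix(x)
  p  <- ncol(x)
  lx <- log(x)
  z  <- matrix(NA_real_, nrow(x), p - 1)
  for (i in seq_len(p - 1)) {
    z[, i] <- sqrt(i / (i + 1)) *
      (rowMeans(lx[, 1:i, drop = FALSE]) - lx[, i + 1])
  }
  z
}

# Toy closed composition: each row sums to 1 (hypothetical values)
comp <- matrix(c(0.2, 0.3, 0.5,
                 0.1, 0.6, 0.3,
                 0.4, 0.4, 0.2), ncol = 3, byrow = TRUE)

z <- ilr_sketch(comp)
dim(z)  # p = 3 parts in, p - 1 = 2 coordinates out
```

Because only log-ratios enter the calculation, rescaling the rows (for example from proportions to percentages) leaves the coordinates unchanged.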
The x- and y-axis labels are set appropriately to indicate the type of 2-d projection in the display.
A measure of the ‘stress’ in generating the 2-d projection is estimated and displayed; low stress indicates the projection faithfully represents the relative ‘positions’ of the data in the original p-space.
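To make the idea of stress concrete, the sketch below computes Sammon's stress for a 2-d configuration in base R. It is an illustration only: the source does not state exactly how gx.2dproj estimates stress for each procedure, the function sammon_stress and the toy data are hypothetical, and classical scaling via stats::cmdscale is used merely as a stand-in for any 2-d projection.

```r
# Sammon's stress: sum((d - d_proj)^2 / d) / sum(d) over all pairs of individuals
sammon_stress <- function(d_orig, d_proj) {
  d_orig <- as.vector(d_orig)
  d_proj <- as.vector(d_proj)
  sum((d_orig - d_proj)^2 / d_orig) / sum(d_orig)
}

set.seed(1)

# Toy data in 5-space: a 2-d projection must distort some distances
x5 <- matrix(rnorm(50 * 5), ncol = 5)
d5 <- dist(x5)
stress5 <- sammon_stress(d5, dist(cmdscale(d5, k = 2)))

# Toy data already in 2-space: the distances can be reproduced exactly
x2 <- matrix(rnorm(50 * 2), ncol = 2)
d2 <- dist(x2)
stress2 <- sammon_stress(d2, dist(cmdscale(d2, k = 2)))
```

Here stress2 is numerically zero because nothing is lost in the projection, while stress5 is positive, reflecting the distortion introduced by collapsing five dimensions to two.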
The following are returned as an object to be saved for further use:
main |
the plot title. |
input |
a text string containing the name of the |
usage |
The projection option selected, and the values, |
xlab |
the 2-d projection x-axis label. |
ylab |
the 2-d projection y-axis label. |
matnames |
the individual, sample, row identifiers and the names of the input variables. If there are no individual, sample, row identifiers then row numbers are used. If an ilr transform has been used the variable names will be the |
row.numbers |
the row numbers of the individuals, samples, remaining after a trim. If a trim has been executed only the row numbers for the remaining data are stored. |
x |
the n x-axis values for the 2-d projection. |
y |
the n y-axis values for the 2-d projection. |
stress |
the estimated stress of fitting 2-d projection to the |
Any less than detection limit values represented by negative values, or zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function, see ltdl.fix.df.
Any rows in the data matrix with NAs are removed prior to computing the 2-d projection. In the instance of an ilr transformation NAs have to be removed prior to undertaking the transformation, see remove.na.
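Row-wise removal of NAs can be sketched in base R as follows. This is an assumed equivalent for illustration, not the package's remove.na function; the toy matrix and variable names are hypothetical.

```r
# Toy 4 x 3 data matrix with NAs in rows 2 and 3 (hypothetical values)
xx <- matrix(c(1.2, NA, 3.4, 0.8,
               5.0, 6.1, NA, 2.2,
               0.3, 0.4, 0.5, 0.6), ncol = 3)

keep        <- complete.cases(xx)      # TRUE for rows with no NA
xx_clean    <- xx[keep, , drop = FALSE]
row_numbers <- which(keep)             # original row numbers that remain
```

Keeping the surviving row numbers alongside the cleaned matrix makes it possible to relate projected points back to the original data.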
Repeated executions of the ‘fastICA’ implementation of Projection Pursuit lead to various mirror images of one another unless set.seed is used to ensure each execution commences with the same seed.
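The effect of fixing the seed can be checked directly with the fastICA package. The sketch below is an illustration under assumptions: the toy data and the seed values are hypothetical, and the code calls fastICA::fastICA itself rather than going through gx.2dproj. It is skipped if fastICA is not installed.

```r
same_seed_identical <- NA  # stays NA if the fastICA package is not installed

if (requireNamespace("fastICA", quietly = TRUE)) {
  set.seed(42)
  x <- matrix(rexp(200 * 4), ncol = 4)  # toy non-Gaussian data (hypothetical)

  # Same seed before each run, so each run starts from the same random
  # initial un-mixing matrix
  set.seed(123)
  run1 <- fastICA::fastICA(x, n.comp = 2)$S
  set.seed(123)
  run2 <- fastICA::fastICA(x, n.comp = 2)$S

  same_seed_identical <- isTRUE(all.equal(run1, run2))
}
```

Without the repeated set.seed call, the two runs may differ in the sign or order of the components, which shows up as mirror images in the plot.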
This function requires that packages MASS (Venables and Ripley) and fastICA (Marchini, Heaton and Ripley) both be available.
Robert G. Garrett
Cox, T.F. and Cox, M.A.A., 2001. Multidimensional Scaling. Chapman and Hall, 308 p.
Friedman, J.H. and Rafsky, L.C., 1981. Graphics for the multivariate two-sample problem. Journal of the American Statistical Association, 76(374):277-291.
Hyvarinen, A. and Oja, E., 2000. Independent Component Analysis: Algorithms and Applications. Neural Networks, 13(4-5):411-430.
Reimann, C., Filzmoser, P., Garrett, R. and Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Ltd., 362 p.
Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition. Springer, 501 p.
## Make test data available
data(sind.mat2open)

## Display default, Sammon non-linear map, 2-d projection
sind.2dproj <- gx.2dproj(sind.mat2open, ifilr = TRUE)

## Display saved object identifying input matrix row numbers (cex = 0.7),
## and with an alternate main title (cex.main = 0.8)
gx.2dproj.plot(sind.2dproj, rowids = TRUE, cex = 0.7, cex.main = 0.8,
main = "Howarth & Sinding-Larsen\nStream Sediment ilr Transformed Data")

## Display Kruskal's non-metric multidimensional scaling 2-d projection
sind.2dproj <- gx.2dproj(sind.mat2open, proc = "iso", ifilr = TRUE)

## Display saved object identifying input matrix row numbers (cex = 0.7),
## and with an alternate main title (cex.main = 0.8)
gx.2dproj.plot(sind.2dproj, rowids = FALSE, cex = 0.7, cex.main = 0.8,
main = "Howarth & Sinding-Larsen\nStream Sediment ilr Transformed Data")

## Display default, Sammon non-linear map, 2-d projection, removing the three
## most extreme individuals
sind.2dproj.trim3 <- gx.2dproj(sind.mat2open, ifilr = TRUE,
row.omits = c(13,15,16))

## Clean-up
rm(sind.2dproj)
rm(sind.2dproj.trim3)