Detect and classify compositional outliers.
Detects outliers and classifies them according to different possible explanations.
OutlierClassifier1(X, ...)

## S3 method for class 'acomp'
OutlierClassifier1(X, ..., alpha=0.05,
    type=c("best","all","type","outlier","grade"), goodOnly=NULL,
    corrected=TRUE, RedCorrected=FALSE, robust=TRUE)
X | the dataset as an acomp object
... | further arguments to MahalanobisDist/gsi.mahOutlier
alpha | The confidence level for identifying outliers.
type | What type of classification should be used. best: which single component would best explain the outlier. all: a binary coding specifying all components that could explain the outlier. type: is it a normal observation
goodOnly | An integer vector. Only the observations at the specified indices are used for estimating the outlier criteria. This parameter is useful if only a small portion of the dataset is reliable.
corrected | logical. The literature often proposes comparing the Mahalanobis distances with chi-squared approximations of their distributions. However, this does not correct for multiple testing. If corrected is TRUE, a correction for multiple testing is used. In either case the chi-squared approximation is not used; confidence bounds are computed by a simulation-based procedure instead.
RedCorrected | logical. If an outlier is detected, we can try to find out whether changing a single component would be sufficient to bring the observation below the outlier detection limit. Since this second step only checks the few observations already flagged as outliers, no second correction step applies as long as the number of outliers is not very high.
robust | A robustness description as defined in
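The multiple-testing issue mentioned under corrected can be illustrated with a small base R sketch. It is not the procedure used by OutlierClassifier1, which relies on simulated bounds rather than the chi-squared approximation; it only shows, on assumed simulated normal data, why an uncorrected per-observation cutoff flags more points than a cutoff adjusted for the number of observations. The Šidák-style adjustment and all variable names are assumptions chosen for this illustration.

```r
# Illustration only: chi-squared cutoffs with and without a
# multiple-testing correction (NOT the simulation-based bounds
# that OutlierClassifier1 uses internally).
set.seed(1)
n <- 200                                  # number of observations
p <- 2                                    # number of dimensions
X <- matrix(rnorm(n * p), ncol = p)       # outlier-free normal data

# Squared Mahalanobis distances to the sample mean
d2 <- mahalanobis(X, colMeans(X), cov(X))

alpha <- 0.05
# Uncorrected: each observation is tested at level alpha
cut_plain <- qchisq(1 - alpha, df = p)

# Sidak-style correction: choose the per-observation level so that
# the chance of at least one false alarm among n tests is alpha
alpha_each <- 1 - (1 - alpha)^(1 / n)
cut_corr <- qchisq(1 - alpha_each, df = p)

n_plain <- sum(d2 > cut_plain)            # flagged without correction
n_corr  <- sum(d2 > cut_corr)             # flagged with correction
print(c(uncorrected = n_plain, corrected = n_corr))
```

Without the correction, roughly alpha times n clean observations are expected to exceed the cutoff even though the data contain no outliers, which is the motivation for the default corrected=TRUE.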
See outliersInCompositions for a comprehensive introduction to the treatment of outliers in compositions.
See ClusterFinder1 for an alternative method to classify observations in the context of outliers.
A factor classifying the observations in the dataset as "ok" or some type of outlier.
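A minimal sketch of how the returned factor might be inspected, assuming the compositions and robustbase packages are installed and using the sa.outliers1 dataset from SimulatedAmounts that also appears in the examples. The variable names cls and flagged are chosen for this illustration only, and the code is skipped when the packages are not available.

```r
# Sketch: tabulate the classification and list the flagged observations.
# Runs only if both packages are available.
if (requireNamespace("compositions", quietly = TRUE) &&
    requireNamespace("robustbase", quietly = TRUE)) {
  library(compositions)
  data(SimulatedAmounts)

  # One label per observation: "ok" or some type of outlier
  cls <- OutlierClassifier1(acomp(sa.outliers1), alpha = 0.05, type = "best")

  print(table(cls))                # how many observations fall in each class
  flagged <- which(cls != "ok")    # row indices of observations not rated "ok"
  print(flagged)
}
```

Since the result is an ordinary factor, it can be passed directly to table(), used for subsetting, or converted with as.numeric() for plotting symbols.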
The package robustbase is required for using the robust estimations.
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
## Not run: 
tmp <- set.seed(1400)
A <- matrix(c(0.1,0.2,0.3,0.1), nrow=2)
Mvar <- 0.1*ilrvar2clr(A %*% t(A))
Mcenter <- acomp(c(1,2,1))
data(SimulatedAmounts)
datas <- list(data1=sa.outliers1, data2=sa.outliers2, data3=sa.outliers3,
              data4=sa.outliers4, data5=sa.outliers5, data6=sa.outliers6)

opar <- par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
tmp <- mapply(function(x, y) {
  outlierplot(x, type="scatter", class.type="grade")
  title(y)
}, datas, names(datas))

par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
tmp <- mapply(function(x, y) {
  myCls2 <- OutlierClassifier1(x, alpha=0.05, type="all", corrected=TRUE)
  outlierplot(x, type="scatter", classifier=OutlierClassifier1, class.type="best",
              Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),
              pch=as.numeric(myCls2))
  legend(0, 1, legend=levels(myCls2), pch=1:length(levels(myCls2)))
  title(y)
}, datas, names(datas))

par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
for( i in 1:length(datas) )
  outlierplot(datas[[i]], type="ecdf", main=names(datas)[i])

par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
for( i in 1:length(datas) )
  outlierplot(datas[[i]], type="portion", main=names(datas)[i])

par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
for( i in 1:length(datas) )
  outlierplot(datas[[i]], type="nout", main=names(datas)[i])

par(opar)
## End(Not run)