An EDA Graphical and Statistical Summary
Plots a three-panel graphical distributional summary for a data set: a histogram and a cumulative normal percentage probability (CPP) plot, with a table of selected percentiles of the data and summary statistics between them. Optionally, the EDA graphics may be plotted with base 10 logarithmic scaling.
inset(xx, xlab = deparse(substitute(xx)), log = FALSE, xlim = NULL, nclass = NULL, colr = NULL, ifnright = TRUE, table.cex = 0.7, ...)
xx |
name of the variable to be plotted. |
xlab |
by default the character string for |
log |
to display the data with logarithmic (x-axis) scaling, set log = TRUE. |
xlim |
default limits of the x-axis are determined in the function. However when used stand-alone the limits may be user-defined by setting |
nclass |
the default procedure for preparing the histogram depends on sample size. Where N <= 500 the Scott (1979) rule is used, and where N > 500 the Freedman-Diaconis (1981) rule; both rules are resistant to the presence of outliers, and usually provide informative histograms. Alternatively, the user may define the histogram binning by setting |
colr |
by default the histogram is infilled in grey, |
ifnright |
controls where the sample size is plotted in the histogram display; by default this is in the upper right corner of the plot. If the data distribution is such that the upper left corner would be preferable, set ifnright = FALSE. |
table.cex |
controls the size of the text in the central panel containing the summary statistics table; the default is 0.7. |
... |
further arguments to be passed to methods. For example, by default individual data points in the CPP plot are marked by a plus sign, |
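The sample-size switch described for nclass can be sketched with the bin-count functions that ship with base R (nclass.scott and nclass.FD in grDevices). This is a minimal illustration on simulated data, not the code used inside inset; the helper name pick_nclass is hypothetical.

```r
# Sketch only: base-R bin-count rules, not the internals of inset().
set.seed(42)
small <- rlnorm(300, meanlog = 3, sdlog = 0.8)   # N <= 500
large <- rlnorm(2000, meanlog = 3, sdlog = 0.8)  # N > 500

# Hypothetical helper mirroring the sample-size switch described above.
pick_nclass <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) <= 500) {
    grDevices::nclass.scott(x)   # Scott (1979) rule
  } else {
    grDevices::nclass.FD(x)      # Freedman-Diaconis (1981) rule
  }
}

n_small <- pick_nclass(small)
n_large <- pick_nclass(large)

# hist() treats a single number in 'breaks' as a suggestion only and
# rounds the break points to "pretty" values.
h <- hist(large, breaks = n_large, plot = FALSE)
```

Because hist treats a single bin count as a suggestion, the number of bars actually drawn can differ from the value returned by either rule.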
A histogram is displayed on the left, and a cumulative normal percentage probability plot on the right. Between the two is a table of simple summary statistics, computed by gx.stats, including minimum, maximum and percentile values, robust estimates of standard deviation, and the mean, standard deviation and coefficient of variation. The plots may be displayed with logarithmic axes; however, the summary statistics are not computed with a logarithmic transform.
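For readers who want to see the kind of quantities such a table holds, the following base-R sketch computes them on the untransformed data. It is not the gx.stats implementation: the particular percentiles and the two robust spread estimates (MAD and IQR/1.349) are assumptions chosen for illustration, and the data are simulated.

```r
# Sketch only: illustrative summary statistics in base R, not gx.stats().
set.seed(1)
x <- c(rlnorm(200, meanlog = 2, sdlog = 0.6), NA)
x <- x[!is.na(x)]                      # NAs are dropped before summarising

# Selected percentiles (this particular choice is an assumption).
pcts <- quantile(x, probs = c(0.02, 0.05, 0.25, 0.50, 0.75, 0.95, 0.98))

stats <- c(
  N      = length(x),
  Min    = min(x),
  Max    = max(x),
  MAD    = mad(x),                     # robust SD estimate (scaled by 1.4826)
  IQR_SD = IQR(x) / 1.349,             # robust SD estimate from the IQR
  Mean   = mean(x),
  SD     = sd(x),
  CV_pct = 100 * sd(x) / mean(x)       # coefficient of variation, percent
)

print(round(pcts, 2))
print(round(stats, 2))
```

All values are computed on the raw scale, matching the statement that no logarithmic transform is applied to the statistics even when the plots use logarithmic axes.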
Sometimes the table between the two plots may be truncated on the left and/or right, or incompletely displayed. Reducing the size of the graphics window will lead to a complete display. If this needs to be done, the function must be run again to ensure that a correctly dimensioned display is saved. Once saved as a complete graphics file, the image may be resized in the receiving document.
Any less-than-detection-limit values represented by negative values, and any zeros or other numeric codes representing blanks in the data, must be removed prior to executing this function; see ltdl.fix.df.
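As a rough illustration of this kind of pre-cleaning, the sketch below works on a plain numeric vector in base R. It is not ltdl.fix.df: the helper clean_values, the blank code -9999, and the choice to replace a negative value by half its absolute value (one common convention for less-than-detection-limit data) are all assumptions made for the example.

```r
# Sketch only: a hypothetical base-R cleaner, not ltdl.fix.df().
raw <- c(12, -5, 0, 38, -9999, 7, NA, 21)

clean_values <- function(x, coded = -9999, zero2na = TRUE) {
  x[!is.na(x) & x == coded] <- NA        # numeric code standing for a blank
  if (zero2na) {
    x[!is.na(x) & x == 0] <- NA          # zeros standing for blanks
  }
  neg <- !is.na(x) & x < 0               # negative value = -(detection limit)
  x[neg] <- abs(x[neg]) / 2              # assumed convention: half the limit
  x
}

cleaned <- clean_values(raw)
print(cleaned)
```

The blank code is handled before the negative values so that a negative code such as -9999 is not mistaken for a detection limit.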
Any NAs in the data vector are removed prior to displaying the plot.
If the default selection for xlim is inappropriate it can be set, e.g., xlim = c(0, 200) or c(2, 200). If the defined limits lie within the observed data range a truncated plot will be displayed. If this occurs the number of data points omitted is displayed below the total number of observations.
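The count of omitted points can be reproduced by hand, which is useful for checking a truncated plot. The sketch below uses made-up values and assumes that points lying exactly on a limit are kept; how inset treats boundary values is not stated here.

```r
# Sketch only: counting points that fall outside user-defined x-axis limits.
x    <- c(1.5, 3, 8, 15, 40, 120, 250, 400)  # made-up data
xlim <- c(2, 200)

inside    <- x >= xlim[1] & x <= xlim[2]     # assumption: limits are inclusive
n_total   <- length(x)
n_omitted <- sum(!inside)

cat("N =", n_total, "\n")
cat("omitted =", n_omitted, "\n")
```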
The purpose of this function is to prepare publication quality graphics files (.emf or .ps) that can be included in reports or used as inset statistical summaries for maps. If a series of these is to be prepared, the function inset.exporter can be used to advantage, as it saves a graphics file as part of its procedure.
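Where a graphics file has to be written by hand, the usual base-R pattern is to open a file device, draw, and close the device. The sketch below writes a PostScript file with the base postscript device; the file name, page dimensions and simulated data are assumptions, and a plain histogram stands in for the full three-panel display. On Windows, win.metafile can be used in the same way to obtain an .emf file.

```r
# Sketch only: saving a plot to a .ps file with base R graphics devices.
out_file <- file.path(tempdir(), "cu_summary.ps")   # hypothetical file name

set.seed(7)
cu <- rlnorm(150, meanlog = 2.5, sdlog = 0.7)       # simulated stand-in data

postscript(out_file, width = 8, height = 3,
           horizontal = FALSE, onefile = FALSE, paper = "special")
hist(cu, main = "", xlab = "Cu (mg/kg)")            # stand-in for the display
dev.off()                                           # close the device to flush the file

file.exists(out_file)
```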
For summary statistics tables to complement the graphical display see gx.stats, gx.summary1, gx.summary2 and gx.ngr.summary.
In some R installations the generation of multi-panel displays and the use of function eqscplot from package MASS cause warning messages related to graphics parameters to be displayed on the current device. These may be suppressed by entering options(warn = -1) on the R command line, or that line may be included in a ‘first’ function prepared by the user that loads the ‘rgr’ package, etc.
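Because options(warn = -1) silences every warning for the rest of the session, it can be worth restoring the previous setting afterwards, or limiting the suppression to a single call. The sketch below shows both patterns in base R, using a deliberately warning-prone as.numeric call as a stand-in for the plotting function.

```r
# Sketch only: two ways to silence warnings without leaving them off.

# 1. Switch warnings off, run the noisy code, then restore the old setting.
old_opts <- options(warn = -1)
result1  <- as.numeric(c("1", "2", "x"))   # would normally warn about NAs
options(old_opts)                          # restore the previous 'warn' level

# 2. Suppress warnings for one expression only.
result2 <- suppressWarnings(as.numeric(c("1", "2", "x")))
```

The second form leaves the session-wide setting untouched, so warnings from unrelated code are still reported.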
Robert G. Garrett
Venables, W.N. and Ripley, B.D., 2001. Modern Applied Statistics with S-Plus, 3rd Edition, Springer, 501 p. See p. 119 for a description of histogram bin selection computations.
## Make test data available
data(kola.o)
attach(kola.o)

## Generates an initial display
inset(Cu)

## Provides a more appropriate display for publication
inset(Cu, xlab = "Cu (mg/kg) in <2 mm O-horizon soil", log = TRUE)

## NOTE: The example statistics table may not display correctly

## Detach test data
detach(kola.o)