binning

Construct frequency table from raw data


Description

Given a vector or a matrix x, this function constructs a frequency table over intervals that cover the range of x.

Usage

binning(x, y, breaks, nbins)

Arguments

x, y

a vector or a matrix with either one or two columns. If x is a one-dimensional matrix, this is equivalent to a vector.

breaks

either a vector or a matrix with two columns (depending on the dimension of x), giving the division points of the axis, or of both axes in the matrix case. It must not include Inf, -Inf or NAs, and it must span the whole range of the x points. If breaks is not given, it is computed by dividing the range of x into nbins intervals for each of the axes.
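The requirements on breaks can be illustrated with a minimal base-R sketch. The helpers default_breaks and breaks_ok are hypothetical names introduced here for illustration and are not part of sm; the equal-width rule is an assumption about what "dividing the range of x into nbins intervals" means, and the package may adjust the end points differently.

```r
# Hypothetical helper (not in sm): equal-width division points over range(x).
# nbins intervals need nbins + 1 division points.
default_breaks <- function(x, nbins) {
  seq(min(x), max(x), length.out = nbins + 1)
}

# Hypothetical check mirroring the documented requirements:
# no NA, Inf or -Inf, and the breaks must span the whole range of x.
breaks_ok <- function(x, breaks) {
  all(is.finite(breaks)) && min(breaks) <= min(x) && max(breaks) >= max(x)
}

x <- c(0, 1.5, 3, 4.5, 8)
b <- default_breaks(x, nbins = 4)   # 0 2 4 6 8

breaks_ok(x, b)                     # TRUE
breaks_ok(x, seq(0, 5, by = 1))     # FALSE: 8 lies beyond the last break
breaks_ok(x, c(-Inf, 4, Inf))       # FALSE: infinite division points
```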

nbins

the number of intervals on the x axis (in the vector case), or a vector of two elements with the number of intervals on each axis of x (in the matrix case). If nbins is not given, a value is computed as round(log(length(x))/log(2)+1) or using a similar expression in the matrix case.
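As a worked instance of the default rule in the vector case, the expression can be evaluated directly. The function name default_nbins is hypothetical and used only for this illustration; the matrix case is not covered because the text does not give its exact expression.

```r
# Default number of bins for a vector of length n, as stated above:
# round(log(n) / log(2) + 1), i.e. roughly log2(n) + 1.
default_nbins <- function(n) round(log(n) / log(2) + 1)

default_nbins(1000)    # log2(1000) is about 9.97, so the result is 11
default_nbins(100000)  # log2(1e5) is about 16.61, so the result is 18
```

The number of bins therefore grows only logarithmically with the sample size: a hundredfold increase in n adds about seven bins.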

Details

This function is called automatically (under the default settings) by some of the functions of the sm library when the sample size is large, to allow handling of datasets of essentially unlimited size. Specifically, it is used by sm.density, sm.regression, sm.ancova, sm.binomial and sm.poisson.

Value

In the vector case, a list is returned containing the following elements: a vector x of the midpoints of the bins excluding those with 0 frequencies, the associated frequencies x.freq, the complete vector of observed frequencies freq.table (including the 0 frequencies), and the vector breaks of division points.

In the matrix case, the returned value is a list with the following elements: a two-dimensional matrix x with the coordinates of the midpoints of the two-dimensional bins excluding those with 0 frequencies, its associated matrix x.freq of frequencies, the matrix breaks of division points, and the observed frequencies freq.table in full tabular form.
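The structure of the returned list in the vector case can be sketched in base R. This is only an illustration under stated assumptions: bin_1d is a hypothetical function that does not call binning, the element names follow the description above, and the use of cut with include.lowest = TRUE is a choice made for this sketch rather than a statement about the package internals.

```r
# Hypothetical base-R sketch of 1-d binning (not the sm implementation).
bin_1d <- function(x, breaks) {
  # Assign each observation to an interval; include.lowest keeps min(breaks).
  f <- cut(x, breaks = breaks, include.lowest = TRUE)
  if (anyNA(f)) stop("breaks do not span the range of x")

  freq <- tabulate(f, nbins = nlevels(f))              # counts, zeros included
  mid  <- (breaks[-length(breaks)] + breaks[-1]) / 2   # bin midpoints
  keep <- freq > 0                                     # drop empty bins

  list(x = mid[keep], x.freq = freq[keep], freq.table = freq, breaks = breaks)
}

x   <- c(0.1, 0.4, 0.6, 2.2, 2.9)
res <- bin_1d(x, breaks = 0:3)

res$x           # 0.5 2.5  (midpoint 1.5 is dropped: its bin is empty)
res$x.freq      # 3 2
res$freq.table  # 3 0 2
```

Replacing the raw observations by the pairs (x, x.freq) is what lets downstream smoothing work on a number of points bounded by the number of bins rather than by the sample size.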

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

Examples

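library(sm)

# example of 1-d use: default breaks, then user-supplied breaks
x  <- rnorm(1000)
xb <- binning(x)
# breaks must span range(x); widen them if a draw falls outside [-4, 4]
xb <- binning(x, breaks = seq(-4, 4, by = 0.5))

# example of 2-d use: a two-column matrix with 12 bins on each axis
x  <- rnorm(1000)
y  <- 2 * x + 0.5 * rnorm(1000)
x  <- cbind(x, y)
xb <- binning(x, nbins = 12)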

sm

Smoothing Methods for Nonparametric Regression and Density Estimation

v2.2-5.6
GPL (>= 2)
Authors
Adrian Bowman and Adelchi Azzalini. Ported to R by B. D. Ripley <ripley@stats.ox.ac.uk> up to version 2.0, version 2.1 by Adrian Bowman and Adelchi Azzalini, version 2.2 by Adrian Bowman.
Initial release
2018-09-27
