DTSg: aggregate.DTSg – R documentation

aggregate.DTSg

Aggregate Values

Description

Applies a temporal aggregation level function to the .dateTime column of a DTSg object and aggregates its values column-wise to the function's temporal aggregation level utilising one or more provided summary functions. Additionally, it sets the object's aggregated field to TRUE. See DTSg for further information.

Usage

## S3 method for class 'DTSg'
aggregate(
  x,
  funby,
  fun,
  ...,
  cols = self$cols(class = "numeric"),
  n = FALSE,
  ignoreDST = FALSE,
  multiplier = 1L,
  funbyHelpers = NULL,
  clone = getOption("DTSgClone")
)

Arguments

`x`	A `DTSg` object (S3 method only).
`funby`	One of the temporal aggregation level functions described in `TALFs` or a user defined temporal aggregation level function. See details for further information.
`fun`	A summary function, (named) `list` of summary functions or (named) character vector specifying summary functions applied column-wise to all the values of the same temporal aggregation level. The return value(s) must be of length one. See details for further information.
`...`	Further arguments passed on to `fun`.
`cols`	A character vector specifying the columns to aggregate.
`n`	A logical specifying if a column named .n giving the number of values per temporal aggregation level is added. See details for further information.
`ignoreDST`	A logical specifying if daylight saving time is ignored during aggregation. See details for further information.
`multiplier`	A positive integerish value “multiplying” the temporal aggregation level of certain `TALFs`. See details for further information.
`funbyHelpers`	An optional `list` with helper data passed on to `funby`. See details for further information.
`clone`	A logical specifying if the object is modified in place or if a clone (copy) is made beforehand.

Details

User-defined temporal aggregation level functions have to return a POSIXct vector of the same length as the time series and accept two arguments: a POSIXct vector as their first and a list with helper data as their second. The default elements of this list are as follows:

timezone: Same as timezone field. See DTSg for further information.
ignoreDST: Same as ignoreDST argument.
periodicity: Same as periodicity field. See DTSg for further information.
na.status: Same as na.status field. See DTSg for further information.
multiplier: Same as multiplier argument.

Any additional element specified in the funbyHelpers argument is appended to the end of the list. In case funbyHelpers contains an ignoreDST or multiplier element, it takes precedence over the respective method argument. A timezone, periodicity or na.status element is rejected.
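
The contract for user-defined temporal aggregation level functions described above can be sketched in base R. The function `byWeek`, its argument names and the hand-built helper list are hypothetical and serve only as an illustration. The sketch relies on nothing beyond the stated contract: a POSIXct vector as the first argument, a list with a `timezone` element as the second, and a POSIXct vector of the same length as the return value.

```r
# Hypothetical user-defined TALF: maps every timestamp to the start of its
# calendar week (Monday, 00:00) in the time zone found in the helper list.
byWeek <- function(.dateTime, .helpers) {
  tz <- .helpers[["timezone"]]
  lt <- as.POSIXlt(.dateTime, tz = tz)
  # wday runs from 0 (Sunday) to 6 (Saturday); convert to days since Monday
  daysSinceMonday <- (lt$wday + 6L) %% 7L
  monday <- as.Date(lt) - daysSinceMonday
  # must return a POSIXct vector of the same length as the input
  as.POSIXct(format(monday), tz = tz)
}

# Stand-in for the helper list (assumption: only the element used is filled in)
helpers <- list(timezone = "UTC", multiplier = 1L)
timestamps <- as.POSIXct(
  c("2024-01-03 12:00:00", "2024-01-07 23:59:59", "2024-01-08 00:00:00"),
  tz = "UTC"
)
weekStarts <- byWeek(timestamps, helpers)
```

With the package loaded, a function of this shape would be supplied through the funby argument, for example `funby = byWeek`.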

Some examples for fun are as follows:

mean
list(min = min, max = max)
c(sd = "sd", var = "var")

A list or character vector must have names in case more than one summary function is provided. The method can benefit from data.table's GForce optimisation in case a character vector specifying summary functions is provided.
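
The three accepted shapes of fun and the naming rule can be illustrated with a small base R sketch. The helper `asSummaryFunctions` is hypothetical and is not how the package resolves fun internally. It only shows that a single function, a named list and a named character vector can all be reduced to a list of functions, each of which must return one value.

```r
# Illustration only: normalise the three shapes of `fun` into a list of
# functions. This is an assumption-based sketch, not DTSg's implementation.
asSummaryFunctions <- function(fun) {
  if (is.function(fun)) {
    return(list(fun))
  }
  # more than one summary function requires names
  if (length(fun) > 1L && (is.null(names(fun)) || any(names(fun) == ""))) {
    stop("more than one summary function requires names")
  }
  # match.fun() accepts both functions and function names given as strings
  lapply(fun, match.fun)
}

values <- c(2, 4, 4, 4, 5, 5, 7, 9)
funs <- asSummaryFunctions(c(sd = "sd", var = "var"))
# vapply() enforces that every summary function returns a single number
results <- vapply(funs, function(f) f(values), numeric(1L))
```

A function such as `range`, which returns two values, would fail the length check in `vapply()`. This mirrors the requirement that the return value of each summary function be of length one.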

Depending on the number of columns to aggregate, the .n column contains different counts:

One column: The counts are calculated from the values of that column, excluding missing values. This means that missing values are always stripped regardless of the value of a possible na.rm argument.
More than one column: The counts are calculated from the .dateTime column including all missing values.
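
The difference between the two counting rules can be made concrete with a toy example in base R. The data frame `group` and its columns `a` and `b` are made up for illustration and stand for the rows of a single temporal aggregation level. The code restates the rule above and is not taken from the package.

```r
# Toy data: three timestamps falling into the same aggregation level
group <- data.frame(
  .dateTime = as.POSIXct(c("2024-01-01", "2024-01-02", "2024-01-03"), tz = "UTC"),
  a = c(1.5, NA, 3.0),
  b = c(NA, NA, 2.0)
)

# One column aggregated (here: a): only non-missing values are counted
nOneColumn <- sum(!is.na(group$a))

# More than one column aggregated: timestamps are counted, so rows with
# missing values are included
nSeveralColumns <- length(group$.dateTime)
```

Here `nOneColumn` is 2 and `nSeveralColumns` is 3, so the same rows can yield different .n values depending on how many columns are aggregated.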

ignoreDST tells a temporal aggregation level function whether it should ignore daylight saving time while forming new timestamps. This can be desirable for time series that strictly follow the position of the sun (such as hydrological time series). Doing so ensures that diurnal variations are preserved and all intervals are of “correct” length. A possible limitation is that the daylight saving time shift is invariably assumed to be exactly one hour long. This feature requires that the periodicity of the time series is recognised and is supported by the following TALFs of the package:

The temporal aggregation level of certain TALFs can be adjusted with the help of the multiplier argument. A multiplier of 10, for example, makes byY_____ aggregate to decades instead of years. Another example is a multiplier of 6 provided to by_m____. The function then pools all months of all first half-years and all months of all second half-years, instead of aggregating each calendar month across all years separately. This feature is supported by the following TALFs of the package:

byFasttimeY_____
byFasttimeYm____
byFasttimeYmdH__
byFasttimeYmdHM_
byFasttimeYmdHMS
byFasttime_m____
byFasttime___H__
byFasttime____M_
byFasttime_____S
byY_____
byYm____
byYmdH__ (only for UTC and equivalent time zones as well as all Etc/GMT time zones)
byYmdHM_
byYmdHMS
by_m____
by___H__ (only for UTC and equivalent time zones as well as all Etc/GMT time zones)
by____M_
by_____S
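
The effect of the multiplier argument on the two examples given above (decades from years, half-years from months) can be reproduced with plain integer arithmetic. The functions `floorYear` and `floorMonth` are hypothetical helpers written for this sketch. That the package computes the groups in exactly this way is an assumption.

```r
# Illustration only: map a calendar unit to the first unit of its
# multiplier-sized block. Not taken from the DTSg sources.
floorYear <- function(year, multiplier = 1L) {
  (year %/% multiplier) * multiplier
}
floorMonth <- function(month, multiplier = 1L) {
  # months are 1-based, so shift to 0-based before the integer division
  ((month - 1L) %/% multiplier) * multiplier + 1L
}

# multiplier = 10: years collapse to the first year of their decade
decades <- floorYear(c(1999L, 2000L, 2009L, 2010L), multiplier = 10L)

# multiplier = 6: months collapse to the first month of their half-year
halfYears <- floorMonth(1:12, multiplier = 6L)
```

In this sketch `decades` is 1990, 2000, 2000, 2010, and `halfYears` is 1 for January to June and 7 for July to December.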

Value

Returns an aggregated DTSg object.

Examples

# new DTSg object
x <- DTSg$new(values = flow)

# mean yearly river flows
## R6 method
x$aggregate(funby = byY_____, fun = "mean", na.rm = TRUE)

## S3 method
aggregate(x = x, funby = byY_____, fun = "mean", na.rm = TRUE)

# variance and standard deviation of river flows per quarter
## R6 method
x$aggregate(funby = byYQ____, fun = c(var = "var", sd = "sd"), na.rm = TRUE)

## S3 method
aggregate(x = x, funby = byYQ____, fun = c(var = "var", sd = "sd"), na.rm = TRUE)

# mean of river flows of all first and second half years
## R6 method
x$aggregate(funby = by_m____, fun = "mean", na.rm = TRUE, multiplier = 6)

## S3 method
aggregate(x = x, funby = by_m____, fun = "mean", na.rm = TRUE, multiplier = 6)

DTSg

A Class for Working with Time Series Based on 'data.table' and 'R6' with Largely Optional Reference Semantics

v0.7.0

MIT + file LICENSE

Authors

Gerold Hepp [aut, cre]