EnvStats: epareto – R documentation

Estimate Parameters of a Pareto Distribution

Description

Estimate the location and shape parameters of a Pareto distribution.

Usage

epareto(x, method = "mle", plot.pos.con = 0.375)

Arguments

`x`	numeric vector of observations.
`method`	character string specifying the method of estimation. Possible values are `"mle"` (maximum likelihood; the default) and `"lse"` (least-squares). See the Details section for more information on these estimation methods.
`plot.pos.con`	numeric scalar between 0 and 1 containing the value of the plotting position constant used to construct the values of the empirical cdf. The default value is `plot.pos.con=0.375`. This argument is used only when `method="lse"`.

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.
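As a minimal sketch of this filtering step, assuming it is equivalent to keeping only finite values (the package's internal code may differ), base R's `is.finite()` drops all four kinds of value at once. The vector `x` below is made-up illustration data.

```r
# Hypothetical input containing missing, undefined and infinite values
x <- c(2.1, NA, 3.4, NaN, Inf, 1.7, -Inf)

# is.finite() is FALSE for NA, NaN, Inf and -Inf, so only usable values remain
x_clean <- x[is.finite(x)]
x_clean
```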

Let \underline{x} = (x_1, x_2, …, x_n) be a vector of n observations from a Pareto distribution with parameters location=η and shape=θ.

Maximum Likelihood Estimation (method="mle")
The maximum likelihood estimators (mle's) of η and θ are given by (Evans et al., 1993; p.122; Johnson et al., 1994, p.581):

\hat{η}_{mle} = x_{(1)} \;\;\;\; (1)

\hat{θ}_{mle} = n [∑_{i=1}^n log(\frac{x_i}{\hat{η}_{mle}}) ]^{-1} \;\;\;\; (2)

where x_{(1)} denotes the first order statistic (i.e., the minimum value).
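Equations (1) and (2) can be sketched directly in base R. The helper name `pareto_mle` is hypothetical and is not part of EnvStats; it only illustrates the two formulas and does not build the "estimate" object that `epareto` returns.

```r
# Hypothetical helper: closed-form Pareto MLEs from equations (1) and (2)
pareto_mle <- function(x) {
  x <- x[is.finite(x)]                           # drop NA, NaN, Inf, -Inf
  eta_hat <- min(x)                              # equation (1): sample minimum
  theta_hat <- length(x) / sum(log(x / eta_hat)) # equation (2)
  c(location = eta_hat, shape = theta_hat)
}

# Small hand-checkable case: x = (1, e, e^2), so the logs are 0, 1 and 2
# and the shape estimate is 3 / (0 + 1 + 2) = 1
x <- exp(c(0, 1, 2))
est <- pareto_mle(x)
est
```

Note that the smallest observation contributes log(1) = 0 to the sum, so if every value in `x` is identical the denominator is zero and the shape estimate is not finite.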

Least-Squares Estimation (method="lse")
The least-squares estimators (lse's) of η and θ are derived as follows. Let X denote a Pareto random variable with parameters location=η and shape=θ. It can be shown that

log[1 - F(x)] = θ log(η) - θ log(x) \;\;\;\; (3)

where F denotes the cumulative distribution function of X. Set

y_i = log[1 - \hat{F}(x_i)] \;\;\;\; (4)

z_i = log(x_i) \;\;\;\; (5)

where \hat{F}(x) denotes the empirical cumulative distribution function evaluated at x. The least-squares estimates of η and θ are obtained by solving the regression equation

y_i = β_{0} + β_{1} z_i \;\;\;\; (6)

and setting

\hat{θ}_{lse} = -\hat{β}_{1} \;\;\;\; (7)

\hat{η}_{lse} = exp(\frac{\hat{β}_0}{\hat{θ}_{lse}}) \;\;\;\; (8)

(Johnson et al., 1994, p.580).
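The regression in equations (4) to (8) can be sketched in base R as follows. The helper name `pareto_lse` is hypothetical. The sketch assumes the empirical cdf at the i'th ordered observation is the plotting position (i - a)/(n + 1 - 2a) with a = `plot.pos.con`, which is what base R's `ppoints()` computes; whether `epareto` uses exactly this form is an assumption, so results may differ from the package's.

```r
# Hypothetical helper: least-squares Pareto estimates via equations (4)-(8)
pareto_lse <- function(x, plot.pos.con = 0.375) {
  x <- sort(x[is.finite(x)])
  n <- length(x)
  F_hat <- ppoints(n, a = plot.pos.con)  # assumed empirical cdf values
  y <- log(1 - F_hat)                    # equation (4)
  z <- log(x)                            # equation (5)
  b <- unname(coef(lm(y ~ z)))           # equation (6): intercept, slope
  theta_hat <- -b[2]                     # equation (7)
  eta_hat <- exp(b[1] / theta_hat)       # equation (8)
  c(location = eta_hat, shape = theta_hat)
}

# Check on data that lie exactly on the line in equation (3):
# Pareto quantiles x = eta * (1 - p)^(-1/theta) at the plotting positions p
p <- ppoints(20, a = 0.375)
x <- 2 * (1 - p)^(-1 / 3)
lse <- pareto_lse(x)
lse
```

Because these points satisfy equation (3) exactly, the fit recovers location = 2 and shape = 3 up to rounding. With real data the points scatter around the line and the two estimators generally disagree.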

Value

A list of class "estimate" containing the estimated parameters and other information.
See estimate.object for details.

Note

The Pareto distribution is named after Vilfredo Pareto (1848-1923), a professor of economics. It is derived from Pareto's law, which states that the number of persons N having income ≥ x is given by:

N = A x^{-θ}

where θ denotes Pareto's constant and is the shape parameter for the probability distribution.

The Pareto distribution takes values on the positive real line. All values must be larger than the “location” parameter η, which is really a threshold parameter. There are three kinds of Pareto distributions. The one described here is the Pareto distribution of the first kind. Stable Pareto distributions have 0 < θ < 2. Note that the r'th moment only exists if r < θ.
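The moment condition can be checked numerically. The sketch below uses the density of the Pareto distribution of the first kind, f(x) = θ η^θ / x^(θ+1) for x ≥ η, and compares a numerical integral with the closed form θ η^r / (θ - r), which holds only for r < θ. The helper name `pareto_moment` is hypothetical.

```r
# Hypothetical helper: r'th raw moment by numerical integration of x^r * f(x)
pareto_moment <- function(r, eta, theta) {
  stopifnot(r < theta)  # the integral diverges otherwise
  dens <- function(x) theta * eta^theta * x^(-theta - 1)
  integrate(function(x) x^r * dens(x), lower = eta, upper = Inf)$value
}

# First moment with location = 2 and shape = 3
m1 <- pareto_moment(1, eta = 2, theta = 3)
closed_form <- 3 * 2^1 / (3 - 1)  # theta * eta^r / (theta - r)
c(numerical = m1, closed_form = closed_form)
```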

The Pareto distribution is related to the exponential distribution and logistic distribution as follows. Let X denote a Pareto random variable with location=η and shape=θ. Then log(X/η) has an exponential distribution with parameter rate=θ, and -log\{ [(X/η)^θ] - 1 \} has a logistic distribution with parameters location=0 and scale=1.
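Both relationships can be verified on quantiles without simulation. The sketch below assumes the Pareto cdf F(x) = 1 - (η/x)^θ, so the p'th quantile is η (1 - p)^(-1/θ). The second transform is decreasing in X, so the p'th Pareto quantile maps to the (1 - p)'th logistic quantile. The parameter values are arbitrary illustration choices.

```r
eta <- 2
theta <- 3
p <- seq(0.05, 0.95, by = 0.05)

# Pareto quantiles from the inverse cdf
x <- eta * (1 - p)^(-1 / theta)

# log(X / eta) should match exponential quantiles with rate = theta
exp_q <- log(x / eta)
all.equal(exp_q, qexp(p, rate = theta))

# -log((X / eta)^theta - 1) should match standard logistic quantiles
logis_q <- -log((x / eta)^theta - 1)
all.equal(logis_q, qlogis(1 - p))
```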

The Pareto distribution has a very long right-hand tail. It is often applied in the study of socioeconomic data, including the distribution of income, firm size, population, and stock price fluctuations.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Johnson, N. L., S. Kotz, and N. Balakrishnan. (1994). Continuous Univariate Distributions, Volume 1. Second Edition. John Wiley and Sons, New York.

Examples

  # Generate 30 observations from a Pareto distribution with parameters
  # location=1 and shape=1, then estimate the parameters.
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  library(EnvStats)  # provides rpareto() and epareto()

  set.seed(250)
  dat <- rpareto(30, location = 1, shape = 1)
  epareto(dat)

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Pareto
  #
  #Estimated Parameter(s):          location = 1.009046
  #                                 shape    = 1.079850
  #
  #Estimation Method:               mle
  #
  #Data:                            dat
  #
  #Sample Size:                     30

  #----------

  # Compare the results of using the least-squares estimators:

  epareto(dat, method="lse")$parameters 
  #location    shape 
  #1.085924 1.144180

  #----------

  # Clean up
  #---------

  rm(dat)

EnvStats

Package for Environmental Statistics, Including US EPA Guidance

v2.4.0

GPL (>= 3)

Authors

Steven P. Millard [aut], Alexander Kowarik [ctb, cre] (<https://orcid.org/0000-0001-8598-4130>)

Initial release

2020-10-20
