Simulate a Vector of Random Numbers From a Specified Theoretical or Empirical Probability Distribution
Simulate a vector of random numbers from a specified theoretical probability distribution or empirical probability distribution, using either Latin Hypercube sampling or simple random sampling.
simulateVector(n, distribution = "norm", param.list = list(mean = 0, sd = 1), sample.method = "SRS", seed = NULL, sorted = FALSE, left.tail.cutoff = ifelse(is.finite(supp.min), 0, .Machine$double.eps), right.tail.cutoff = ifelse(is.finite(supp.max), 0, .Machine$double.eps))
n |
a positive integer indicating the number of random numbers to generate. |
distribution |
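a character string denoting the distribution abbreviation. The default value is distribution="norm".
Alternatively, the character string "emp" may be used to specify an empirical distribution based on a set of observations supplied in param.list. |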
param.list |
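a list with values for the parameters of the distribution.
The default value is param.list=list(mean=0, sd=1). Alternatively, if you specify an empirical distribution by setting distribution="emp", then param.list must be a list containing the vector of observations, for example param.list=list(obs=x). |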
sample.method |
a character string indicating whether to use simple random sampling (sample.method="SRS", the default) or Latin Hypercube sampling (sample.method="LHS"). |
seed |
integer to supply to the R function set.seed. The default value is seed=NULL. |
sorted |
logical scalar indicating whether to return the random numbers in sorted
(ascending) order. The default value is sorted=FALSE. |
left.tail.cutoff |
a scalar between 0 and 1 indicating what proportion of the left tail of
the probability distribution to omit for Latin Hypercube sampling.
For densities with a finite support minimum (e.g., Lognormal or
Empirical) the default value is left.tail.cutoff=0; otherwise it is left.tail.cutoff=.Machine$double.eps. |
right.tail.cutoff |
a scalar between 0 and 1 indicating what proportion of the right tail of
the probability distribution to omit for Latin Hypercube sampling.
For densities with a finite support maximum (e.g., Beta or
Empirical) the default value is right.tail.cutoff=0; otherwise it is right.tail.cutoff=.Machine$double.eps. |
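The defaults for left.tail.cutoff and right.tail.cutoff in the usage line depend on whether the corresponding support bound is finite. One plausible reason, which is an assumption here and not something the help text states, is that the quantile function of a distribution with unbounded support returns an infinite value at probability 0 or 1. The base R calls below illustrate that behavior:

```r
# Unbounded support: the quantile at probability 0 or 1 is infinite,
# so a tiny cutoff is needed to keep simulated values finite.
qnorm(0)                                  # -Inf
qnorm(1)                                  #  Inf
is.finite(qnorm(.Machine$double.eps))     # TRUE
is.finite(qnorm(1 - .Machine$double.eps)) # TRUE

# Bounded support: the quantile at the bound is finite, so a cutoff of 0 is safe.
qlnorm(0)                                 # 0 (lognormal support minimum)
qbeta(1, shape1 = 2, shape2 = 3)          # 1 (beta support maximum)
```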
Latin Hypercube Sampling (sample.method="LHS")
When sample.method="LHS", the function simulateVector generates n random numbers using Latin Hypercube sampling. The distribution is divided into n intervals of equal probability 1/n and simple random sampling is performed once within each interval; i.e., Latin Hypercube sampling is simply stratified sampling without replacement, where the strata are defined by the 0'th, 100(1/n)'th, 100(2/n)'th, ..., and 100'th percentiles of the distribution.
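The stratified scheme described above can be sketched in base R with the inverse-CDF method. This is a minimal illustration, not the code that simulateVector actually runs: lhs_sketch is a hypothetical helper, and both the way the tail cutoffs shrink the probability range and the final shuffle are assumptions.

```r
# Hypothetical helper (not part of EnvStats): one-dimensional Latin Hypercube
# sampling from any distribution that has a quantile function qfun.
lhs_sketch <- function(n, qfun, ..., left.tail.cutoff = 0, right.tail.cutoff = 0) {
  # Probability range that remains after trimming the tails (assumed behavior)
  p.lo <- left.tail.cutoff
  p.hi <- 1 - right.tail.cutoff
  # Edges of n strata, each covering an equal share of the probability range
  edges <- seq(p.lo, p.hi, length.out = n + 1)
  # Exactly one uniform draw inside each stratum
  u <- runif(n, min = edges[-(n + 1)], max = edges[-1])
  # Map probabilities to quantiles, then shuffle so position carries no information
  qfun(u, ...)[sample.int(n)]
}

set.seed(47)
x <- lhs_sketch(10, qlnorm, meanlog = 0, sdlog = 1,
                right.tail.cutoff = .Machine$double.eps)

# Each decile of the target distribution should contain exactly one value
table(cut(plnorm(x), breaks = seq(0, 1, by = 0.1)))
```

Indexing with sample.int(n) is used for the shuffle because sample(x) on a length-one numeric vector would sample from 1:x instead of returning x.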
Latin Hypercube sampling, sometimes abbreviated LHS, is a method of sampling from a probability distribution that ensures all portions of the probability distribution are represented in the sample. It was introduced in the published literature by McKay et al. (1979) to overcome the following problem in Monte Carlo simulation based on simple random sampling (SRS). Suppose we want to generate random numbers from a specified distribution. With simple random sampling, few observations are likely to fall in a region of the distribution that has low probability, and there is a real chance that none will. For example, if we generate n observations from the distribution, the probability that none of them falls at or above the 98'th percentile is 0.98^n. So there is a 13% chance that out of 100 random numbers, none will fall at or above the 98'th percentile. If we are interested in reproducing the shape of the distribution, simple random sampling therefore requires a very large number of observations to adequately characterize the tails (Vose, 2008, pp. 59–62).
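The 13% figure follows directly from 0.98^n with n = 100. The short base R calculation below reproduces it and, as an added illustration, finds the smallest n for which the chance of missing the upper 2% tail drops below 1% (the 1% threshold is chosen here for illustration only).

```r
# Probability that none of n simple random draws falls at or above
# the 98th percentile of the distribution
n <- 100
p.miss <- 0.98^n
round(p.miss, 3)
# [1] 0.133

# Smallest n for which that probability is below 1% (illustrative threshold)
n.needed <- ceiling(log(0.01) / log(0.98))
n.needed
# [1] 228
```

By contrast, with Latin Hypercube sampling and n = 100, the two highest strata cover the probabilities from 0.98 to 1, so the sample always contains values at or above the 98'th percentile.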
See Millard (2013) for a visual explanation of Latin Hypercube sampling.
a numeric vector of random numbers from the specified distribution.
Latin Hypercube sampling is often used in probabilistic risk assessment, specifically for sensitivity and uncertainty analysis (e.g., Iman and Conover, 1980; Iman and Helton, 1988; Iman and Helton, 1991; Vose, 1996).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Iman, R.L., and W.J. Conover. (1980). Small Sample Sensitivity Analysis Techniques for Computer Models, With an Application to Risk Assessment (with Comments). Communications in Statistics–Volume A, Theory and Methods, 9(17), 1749–1874.
Iman, R.L., and J.C. Helton. (1988). An Investigation of Uncertainty and Sensitivity Analysis Techniques for Computer Models. Risk Analysis 8(1), 71–90.
Iman, R.L. and J.C. Helton. (1991). The Repeatability of Uncertainty and Sensitivity Analyses for Complex Probabilistic Risk Assessments. Risk Analysis 11(4), 591–606.
McKay, M.D., R.J. Beckman, and W.J. Conover. (1979). A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output From a Computer Code. Technometrics 21(2), 239–245.
Millard, S.P. (2013). EnvStats: an R Package for Environmental Statistics. Springer, New York. https://www.springer.com/book/9781461484554.
Vose, D. (2008). Risk Analysis: A Quantitative Guide. Third Edition. John Wiley & Sons, West Sussex, UK, 752 pp.
# Load the package that provides simulateVector, rlnormAlt, and rpareto
library(EnvStats)

# Generate 10 observations from a lognormal distribution with
# parameters mean=10 and cv=1 using simple random sampling:

simulateVector(10, distribution = "lnormAlt",
  param.list = list(mean = 10, cv = 1), seed = 47, sorted = TRUE)
# [1]  2.086931  2.863589  3.112866  5.592502  5.732602  7.160707
# [7]  7.741327  8.251306 12.782493 37.214748

#----------

# Repeat the above example by calling rlnormAlt directly:

set.seed(47)
sort(rlnormAlt(10, mean = 10, cv = 1))
# [1]  2.086931  2.863589  3.112866  5.592502  5.732602  7.160707
# [7]  7.741327  8.251306 12.782493 37.214748

#----------

# Now generate 10 observations from the same lognormal distribution
# but use Latin Hypercube sampling. Note that the largest value
# is larger than for simple random sampling:

simulateVector(10, distribution = "lnormAlt",
  param.list = list(mean = 10, cv = 1), seed = 47,
  sample.method = "LHS", sorted = TRUE)
# [1]  2.406149  2.848428  4.311175  5.510171  6.467852  8.174608
# [7]  9.506874 12.298185 17.022151 53.552699

#==========

# Generate 50 observations from a Pareto distribution with parameters
# location=10 and shape=2, then use this resulting vector of
# observations as the basis for generating 3 observations from an
# empirical distribution using Latin Hypercube sampling:

set.seed(321)
pareto.rns <- rpareto(50, location = 10, shape = 2)
simulateVector(3, distribution = "emp",
  param.list = list(obs = pareto.rns), sample.method = "LHS")
# [1] 11.50685 13.50962 17.47335

#==========

# Clean up
#---------
rm(pareto.rns)