Fitting a mixture of generalised lambda distributions to data using maximum likelihood estimation via the EM algorithm
This function fits a mixture of two generalised lambda distributions to a dataset by maximum likelihood via the EM algorithm. It is a two-step optimisation procedure: each unimodal part of the bimodal distribution is first modelled using the maximum likelihood method or the starship method (FMKL GLD only); these initial values are then used to maximise the likelihood for the entire bimodal distribution using the EM algorithm. The fitted mixture has the form p*(f1)+(1-p)*(f2), where f1 and f2 are the pdfs of the generalised lambda distributions.
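The mixture form p*(f1)+(1-p)*(f2) and the quantity being minimised can be sketched in a few lines of base R. This is only an illustration: the helper names mix.pdf and mix.negloglik are hypothetical, and the two components are normal pdfs used as stand-ins, not generalised lambda pdfs.

```r
# Mixture density p*f1 + (1-p)*f2 for arbitrary component pdfs f1 and f2
mix.pdf <- function(x, p, f1, f2) p * f1(x) + (1 - p) * f2(x)

# Negative log-likelihood of the mixture for a data vector x
mix.negloglik <- function(x, p, f1, f2) -sum(log(mix.pdf(x, p, f1, f2)))

# Stand-in components (assumption): normal pdfs, NOT generalised lambda pdfs
f1 <- function(x) dnorm(x, mean = 2, sd = 0.3)
f2 <- function(x) dnorm(x, mean = 4.3, sd = 0.4)

# The mixture is itself a density: its area is 1 for any p in [0, 1]
area <- integrate(mix.pdf, lower = 0, upper = 7,
                  p = 0.35, f1 = f1, f2 = f2)$value

# Negative log-likelihood evaluated on the faithful eruption times
nll <- mix.negloglik(faithful[, 1], p = 0.35, f1 = f1, f2 = f2)
```

A smaller value of nll indicates a better fit of the mixture to the data.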
fun.auto.bimodal.ml(data, per.of.mix = 0.01, clustering.m = clara, init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5), leap1=3, leap2=3,fun1="runif.sobol",fun2="runif.sobol",no=10000, max.it=5000, optim.further="Y")
data |
A numerical vector representing the dataset. |
per.of.mix |
Level of mixing between the two parts of the distribution; a cross mix of 1-2% is usually sufficient. |
clustering.m |
Clustering method used to classify the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library; the default is clara. Alternatively, a logical vector specifying how the data should be split. |
init1.sel |
This can be |
init2.sel |
This can be |
init1 |
Initial values of lambda3 and lambda4 for the first generalised lambda distribution. |
init2 |
Initial values of lambda3 and lambda4 for the second generalised lambda distribution. |
leap1 |
Scrambling (0,1,2,3) for the sobol sequence for the first
distribution fit. See scrambling/leap argument for |
leap2 |
Scrambling (0,1,2,3) for the sobol sequence for the second
distribution fit. See scrambling/leap argument for |
fun1 |
A character string of either |
fun2 |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
max.it |
Maximum number of iterations for numerical optimisation. |
optim.further |
Whether to optimise the function further using the full maximum likelihood method; the recommended setting is "Y". |
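The clustering.m argument also accepts a logical vector in place of a clustering method. A minimal sketch of building one for the faithful eruption times follows. The cut-off of 3 minutes is an illustrative assumption chosen by eye, not a value from this help page, and the final call is left commented out because it needs the GLDEX package.

```r
# Eruption durations from the built-in faithful dataset
x <- faithful[, 1]

# Hypothetical cut-off (assumption): TRUE marks the shorter eruptions
split.idx <- x < 3

# Number of observations falling in each part
part.sizes <- table(split.idx)

# Assumed usage, following the clustering.m description (requires GLDEX):
# fit <- fun.auto.bimodal.ml(x, clustering.m = split.idx,
#   init1.sel = "rprs", init2.sel = "rmfmkl",
#   init1 = c(-1.5, 1.5), init2 = c(-0.25, 1.5))
```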
The initial values that work well for RPRS are c(-1.5,1.5) and for RMFMKL
are c(-0.25,1.5). For scrambling, if 1, 2 or 3 the sequence is scrambled,
otherwise not. If 1, Owen type scrambling is applied; if 2, Faure-Tezuka type
scrambling is applied; and if 3, both Owen and Faure-Tezuka type scrambling
are applied. The star method uses the same initial values as rmfmkl since it
uses the FMKL generalised lambda distribution. The Nelder-Mead simplex
algorithm is used in the numerical optimization. rprs stands for revised
percentile method for RS generalised lambda distribution and "rmfmkl" stands
for revised method of moment for FMKL generalised lambda distribution. These
acronyms represent the initial optimization algorithm used to get a
reasonable set of initial values for the subsequent optimization procedures.
This function is an improvement on Su (2007), published in the Journal of
Statistical Software.
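For orientation, the RS and FMKL parameterisations named above differ in their quantile functions. The sketch below writes out the standard forms in base R. The helper names q.rs and q.fmkl are hypothetical and are not GLDEX functions, and the parameter values are illustrative only.

```r
# RS quantile function:
# Q(u) = lambda1 + (u^lambda3 - (1 - u)^lambda4) / lambda2
q.rs <- function(u, l) l[1] + (u^l[3] - (1 - u)^l[4]) / l[2]

# FMKL quantile function:
# Q(u) = lambda1 + ((u^l3 - 1)/l3 - ((1 - u)^l4 - 1)/l4) / lambda2
q.fmkl <- function(u, l) {
  l[1] + ((u^l[3] - 1) / l[3] - ((1 - u)^l[4] - 1) / l[4]) / l[2]
}

# Probabilities strictly inside (0, 1)
u <- seq(0.01, 0.99, by = 0.01)

# Illustrative parameter vectors c(lambda1, lambda2, lambda3, lambda4)
rs.q   <- q.rs(u, c(0, 1, 0.2, 0.2))
fmkl.q <- q.fmkl(u, c(0, 1, -0.25, 1.5))
```

Both curves increase with u for these parameter values, as a quantile function must.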
par |
The best set of parameters found: the first four correspond to the first distribution fit, the next four correspond to the second distribution fit, and the last value is p for the first distribution fit. |
value |
The value of -ML for the parameters obtained. |
counts |
A two-element integer vector giving the number of calls to
|
convergence |
|
message |
A character string giving any additional information returned by
the optimizer, or |
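The layout of par described above can be unpacked as follows. The numbers are made up purely to show the positions and are not the result of any fit.

```r
# Hypothetical parameter vector (made-up numbers, for layout only)
par <- c(2.0, 1.5, 0.2, 0.3,   # lambda1..lambda4, first distribution
         4.3, 2.0, 0.1, 0.4,   # lambda1..lambda4, second distribution
         0.35)                 # p, the weight of the first distribution

first.fit  <- par[1:4]   # parameters of the first distribution fit
second.fit <- par[5:8]   # parameters of the second distribution fit
p          <- par[9]     # mixing proportion for the first distribution
```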
If the number of observations is small, rprs can sometimes fail as the
percentiles may not exist for this data. Also, if the initial values do not
span a valid generalised lambda distribution, try another set of initial values.
Steve Su
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su, S. (2007) Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software *21*(9).
## Fitting faithful data from the dataset library, with the clara clustering
## regime. The first distribution is RS and the second distribution is fmkl.
## The percentage of data mix is 1%.

# fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.01,clustering.m=clara,
# init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
# leap1=3,leap2=3)