GENetic Optimization Using Derivatives
Genoud
is a function that combines evolutionary search
algorithms with derivative-based (Newton or quasi-Newton) methods to
solve difficult optimization problems. Genoud
may also be
used for optimization problems for which derivatives do not exist.
Genoud
, via the cluster
option, supports the use of
multiple computers, CPUs or cores to perform parallel computations.
genoud(fn, nvars, max=FALSE, pop.size=1000, max.generations=100, wait.generations=10, hard.generation.limit=TRUE, starting.values=NULL, MemoryMatrix=TRUE, Domains=NULL, default.domains=10, solution.tolerance=0.001, gr=NULL, boundary.enforcement=0, lexical=FALSE, gradient.check=TRUE, BFGS=TRUE, data.type.int=FALSE, hessian=FALSE, unif.seed=round(runif(1, 1, 2147483647L)), int.seed=round(runif(1, 1, 2147483647L)),print.level=2, share.type=0, instance.number=0, output.path="stdout", output.append=FALSE, project.path=NULL, P1=50, P2=50, P3=50, P4=50, P5=50, P6=50, P7=50, P8=50, P9=0, P9mix=NULL, BFGSburnin=0, BFGSfn=NULL, BFGShelp=NULL, control=list(), optim.method=ifelse(boundary.enforcement < 2, "BFGS", "L-BFGS-B"), transform=FALSE, debug=FALSE, cluster=FALSE, balance=FALSE, ...)
fn |
The function to be minimized (or maximized if
max= For example, if we wish to maximize the |
nvars |
The number of parameters to be selected for the function to be minimized (or maximized). |
max |
Maximization ( |
pop.size |
Population Size. This is the number of individuals |
max.generations |
Maximum Generations. This is the maximum number of generations that
Although the Please see |
wait.generations |
If there is no improvement in the objective function in this number
of generations, |
hard.generation.limit |
This logical variable determines if the Please see |
starting.values |
A vector or matrix containing parameter values
which |
MemoryMatrix |
This variable controls if Note that when |
Domains |
This is a If the user does not provide any values for Domains, For linear and nonlinear constraints please see the discussion in
the |
default.domains |
If the user does not want to provide a |
solution.tolerance |
This is the tolerance level used by |
gr |
A function to provide the gradient for the |
boundary.enforcement |
This variable determines the degree to which
|
lexical |
This option enables lexical optimization. This is
where there are multiple fit criteria and the parameters are chosen so
as to maximize fitness values in lexical order—i.e., the second fit
criterion is only relevant if the parameters have the same fit for the
first etc. The fit function used with this option should return a
numeric vector of fitness values in lexical order. This option
can take on the values of |
gradient.check |
If this variable is |
BFGS |
This variable denotes whether or not |
data.type.int |
This option sets the data type of the parameters of the function to
be optimized. If the variable is With integer parameters, There is no option to mix integer and floating point parameters. If
one wants to mix the two, it is suggested that the user pick integer type
and in the objective function map a particular integer range into a
floating point number range. For example, tell Alternatively, the user could use floating point numbers and round
the appropriate parameters to the nearest integer inside |
hessian |
When this flag is set to |
unif.seed |
This should not be set. To set the seed, one should use
the |
int.seed |
See |
print.level |
This variable controls the level of printing that |
share.type |
If If the project file contains a smaller population than the current
If the number of variables (see |
instance.number |
This number (starting from 0) denotes the number of recursive
instances of For the R version of |
output.path |
This option is no longer supported. It used to
allow one to redirect the output. Now please use
|
output.append |
This option is no longer supported. Please see
|
project.path |
This is the path of the |
P1 |
This is the cloning operator.
|
P2 |
This is the uniform mutation operator. One parameter of the parent is mutated. Please see the Operators Section for details about operators and how they are weighted. |
P3 |
This is the boundary mutation operator. This operator finds a parent and mutates one of its parameters towards the boundary. Please see the Operators Section for details about operators and how they are weighted. |
P4 |
Non-Uniform Mutation. Please see the Operators Section for details about operators and how they are weighted. |
P5 |
This is the polytope crossover. Please see the Operators Section for details about operators and how they are weighted. |
P6 |
Simple Crossover. Please see the Operators Section for details about operators and how they are weighted. |
P7 |
Whole Non-Uniform Mutation. Please see the Operators Section for details about operators and how they are weighted. |
P8 |
Heuristic Crossover. Please see the Operators Section for details about operators and how they are weighted. |
P9 |
Local-Minimum Crossover: BFGS. This is rather CPU intensive, and should be generally used less than the other operators. Please see the Operators Section for details about operators and how they are weighted. |
P9mix |
This is
a tuning parameter for the |
BFGSburnin |
The number of generations which are run before
the BFGS is first used. Premature use of the BFGS can lead to
convergence to a local optimum instead of the global one. This
option allows the user to control how many generations are run
before the BFGS is started and would logically be a non-negative
integer. However, if |
BFGSfn |
This is a function for the BFGS optimizer to
optimize, if one wants to make it distinct from the |
BFGShelp |
An optional function to pass arguments to
|
control |
A list of control
parameters that is passed to |
optim.method |
A character string among those that are admissible for the
|
transform |
A logical that defaults to There are some issues that should be kept in mind. This option cannot
be used when Finally, if |
debug |
This
variable turns on some debugging information. This variable may
be |
cluster |
This can either be an
object of the 'cluster' class returned by one of the
|
balance |
This logical flag controls if load balancing is done across the cluster. Load balancing can result in better cluster utilization; however, increased communication can reduce performance. This option is best used if the function being optimized takes at least several minutes to calculate or if the nodes in the cluster vary significantly in their performance. If cluster==FALSE, this option has no effect. |
... |
Further arguments to be passed to |
Genoud
solves problems that are nonlinear or
perhaps even discontinuous in the parameters of the function to be
optimized. When a statistical model's estimating function (for
example, a log-likelihood) is nonlinear in the model's parameters,
the function to be optimized will generally not be globally
concave and may have irregularities such as saddlepoints or
discontinuities. Optimization methods that rely on derivatives of
the objective function may be unable to find any optimum at all.
Multiple local optima may exist, so that there is no guarantee
that a derivative-based method will converge to the global
optimum. On the other hand, algorithms that do not use derivative
information (such as pure genetic algorithms) are for many
problems needlessly poor at local hill climbing. Most statistical
problems are regular in a neighborhood of the solution.
Therefore, for some portion of the search space, derivative
information is useful for such problems. Genoud
also works
well for problems that no derivative information exists. For
additional documentation and examples please see
http://sekhon.berkeley.edu/rgenoud.
genoud
returns a list
with 7 objects. 8 objects are returned if the user has requested
the hessian to be calculated at the solution. Please see the
hessian
option. The returned objects are:
value |
This variable contains the fitness value at the solution. If
|
par |
This vector contains the parameter values found at the solution. |
gradients |
This vector contains the gradients found at the solution. If no
gradients were calculated, they are reported to be |
generations |
This variable contains the number of generations |
peakgeneration |
This variable contains the generation number at which |
pop.size |
This variable contains the population size that |
operators |
This vector reports the actual number of operators (of each type)
|
hessian |
If the user has requested the hessian
matrix to be returned (via the |
Genoud
has nine operators that it uses. The integer values which are
assigned to each of these operators (P1...P9) are
weights.
Genoud
calculates the sum of s =
P1+P2+...+P9. Each operator is
assigned a weight equal to W_n =
s/(P_n). The number of
times an operator is called usually equals c_n = W_n * pop.size.
Operators 6 (Simple Crossover) and 8 (Heuristic
Crossover) require an even number of individuals to work on—i.e.,
they require two parents. Therefore, the pop.size
variable and
the operators sets must be such that these three operators have an
even number of individuals to work with. If this does not occur,
genoud
automatically upwardly adjusts the population size to make this
constraint hold.
Strong uniqueness checks have been built into the operators to help
ensure that the operators produce offspring different from their
parents, but this does not always happen.
Note that genoud
always keeps the best individual each generation.
genoud
's 9 operators are:
Cloning
Uniform Mutation
Boundary Mutation
Non-Uniform Crossover
Polytope Crossover
Simple Crossover
Whole Non-Uniform Mutation
Heuristic Crossover
Local-Minimum Crossover: BFGS
For more information please see Table 1 of the reference article: http://sekhon.berkeley.edu/papers/rgenoudJSS.pdf.
The most important options affecting performance are those determining
population size (pop.size
) and the
number of generations the algorithm runs
(max.generations
, wait.generations
,
hard.generation.limit
and gradient.check
). Search
performance is expected to improve as
the population size and the number of generations the program runs for
increase. These and the other options should be adjusted for the
problem at hand. Please pay particular attention to the search
domains (Domains
and default.domains
). For more information
please see the reference article.
Linear and nonlinear constraints among the parameters can be
introduced by users in their fit function. For example, if
the sum of parameters 1 and 2 must be less than 725, the following can
be placed in the fit function the user is going to have genoud
maximize: if ( (parm1 + parm2) >= 725) { return(-99999999) }
.
In this example, a very bad fit value is returned to genoud
if the
linear constrain is violated. genoud
will then attempt to find
parameter values that satisfy the constraint.
Alternatively, one can use lexical optimization where the first criterion is a
binary variable that equals 1.0 iff (parm1 + parm2) < 725
and the
second criterion is the fit function, which should also be passed to
BFGSfn
. All candidates where (parm1 + parm2) >= 725
will be
ranked below all candidates where (parm1 + parm2) < 725
and within
these two groups, candidates will be ranked by their fit on the second
criterion. The optimal candidate is thus the one with the best fit on the
second criterion among candidates that satisfy this restriction.
Walter R. Mebane, Jr., University of Michigan,
wmebane@umich.edu, http://www-personal.umich.edu/~wmebane/
Jasjeet S. Sekhon, UC Berkeley, sekhon@berkeley.edu, http://sekhon.berkeley.edu/
Mebane, Walter R., Jr. and Jasjeet S. Sekhon. 2011. "Genetic
Optimization Using Derivatives: The rgenoud Package for R."
Journal of Statistical Software, 42(11): 1-26.
http://www.jstatsoft.org/v42/i11/
Sekhon, Jasjeet Singh and Walter R. Mebane, Jr. 1998. “Genetic
Optimization Using Derivatives: Theory and Application to Nonlinear
Models.” Political Analysis, 7: 187-210.
http://sekhon.berkeley.edu/genoud/genoud.pdf
Mebane, Walter R., Jr. and Jasjeet S. Sekhon. 2004. “Robust
Estimation and Outlier Detection for Overdispersed Multinomial Models
of Count Data.” American Journal of Political Science, 48
(April): 391-410. http://sekhon.berkeley.edu/multinom.pdf
Bright, H. and R. Enison. 1979. Quasi-Random Number Sequences from a Long-Period TLP Generator with Remarks on Application to Cryptography. Computing Surveys, 11(4): 357-370.
#maximize the sin function sin1 <- genoud(sin, nvars=1, max=TRUE) #minimize the sin function sin2 <- genoud(sin, nvars=1, max=FALSE) ## Not run: #maximize a univariate normal mixture which looks like a claw claw <- function(xx) { x <- xx[1] y <- (0.46*(dnorm(x,-1.0,2.0/3.0) + dnorm(x,1.0,2.0/3.0)) + (1.0/300.0)*(dnorm(x,-0.5,.01) + dnorm(x,-1.0,.01) + dnorm(x,-1.5,.01)) + (7.0/300.0)*(dnorm(x,0.5,.07) + dnorm(x,1.0,.07) + dnorm(x,1.5,.07))) return(y) } claw1 <- genoud(claw, nvars=1,pop.size=3000,max=TRUE) ## End(Not run) ## Not run: #Plot the previous run xx <- seq(-3,3,.05) plot(xx,lapply(xx,claw),type="l",xlab="Parameter",ylab="Fit", main="GENOUD: Maximize the Claw Density") points(claw1$par,claw1$value,col="red") # Maximize a bivariate normal mixture which looks like a claw. biclaw <- function(xx) { mNd2 <- function(x1, x2, mu1, mu2, sigma1, sigma2, rho) { z1 <- (x1-mu1)/sigma1 z2 <- (x2-mu2)/sigma2 w <- (1.0/(2.0*pi*sigma1*sigma2*sqrt(1-rho*rho))) w <- w*exp(-0.5*(z1*z1 - 2*rho*z1*z2 + z2*z2)/(1-rho*rho)) return(w) } x1 <- xx[1]+1 x2 <- xx[2]+1 y <- (0.5*mNd2(x1,x2,0.0,0.0,1.0,1.0,0.0) + 0.1*(mNd2(x1,x2,-1.0,-1.0,0.1,0.1,0.0) + mNd2(x1,x2,-0.5,-0.5,0.1,0.1,0.0) + mNd2(x1,x2,0.0,0.0,0.1,0.1,0.0) + mNd2(x1,x2,0.5,0.5,0.1,0.1,0.0) + mNd2(x1,x2,1.0,1.0,0.1,0.1,0.0))) return(y) } biclaw1 <- genoud(biclaw, default.domains=20, nvars=2,pop.size=5000,max=TRUE) ## End(Not run) # For more examples see: http://sekhon.berkeley.edu/rgenoud/R.
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.