There are several minimization methods available. The best choice of algorithm depends on the problem. All of the algorithms use the value of the function and its gradient at each evaluation point, except for the Simplex algorithm which uses function values only.
This is the Fletcher-Reeves conjugate gradient algorithm. The conjugate
gradient algorithm proceeds as a succession of line minimizations. The
sequence of search directions is used to build up an approximation to the
curvature of the function in the neighborhood of the minimum.
An initial search direction p is chosen using the gradient, and line minimization is carried out in that direction. The accuracy of the line minimization is specified by the parameter tol. The minimum along this line occurs when the function gradient g and the search direction p are orthogonal. The line minimization terminates when dot(p,g) < tol |p| |g|. The search direction is updated using the Fletcher-Reeves formula p' = g' - \beta g where \beta=-|g'|^2/|g|^2, and the line minimization is then repeated for the new search direction.
|This is the Polak-Ribiere conjugate gradient algorithm. It is similar to the Fletcher-Reeves method, differing only in the choice of the coefficient \beta. Both methods work well when the evaluation point is close enough to the minimum of the objective function that it is well approximated by a quadratic hypersurface.|
|This is the vector Broyden-Fletcher-Goldfarb-Shanno (BFGS) conjugate gradient algorithm. It is a quasi-Newton method which builds up an approximation to the second derivatives of the function f using the difference between successive gradient vectors. By combining the first and second derivatives the algorithm is able to take Newton-type steps towards the function minimum, assuming quadratic behavior in that region.|
|The steepest descent algorithm follows the downhill gradient of the function at each step. When a downhill step is successful the step-size is increased by a factor of two. If the downhill step leads to a higher function value then the algorithm backtracks and the step size is decreased using the parameter tol. A suitable value of tol for most applications is 0.1. The steepest descent method is inefficient and is included only for demonstration purposes.|
This is the Simplex algorithm of Nelder and Mead. It constructs
n vectors p_i from the
starting vector x and the vector step_size as follows:
These vectors form the n+1 vertices of a simplex in n dimensions. On each iteration the algorithm tries to improve the parameter vector p_i corresponding to the highest function value by simple geometrical transformations. These are reflection, reflection followed by expansion, contraction and multiple contraction. Using these transformations the simplex moves through the parameter space towards the minimum, where it contracts itself.
After each iteration, the best vertex is returned. Note, that due to the nature of the algorithm not every step improves the current best parameter vector. Usually several iterations are required.
The routine calculates the minimizer specific characteristic size as the
average distance from the geometrical center of the simplex to all its
vertices. This size can be used as a stopping criteria, as the simplex
contracts itself near the minimum. The size is returned by the function