
class GradientDescent(maxiter=100, learning_rate=0.01, tol=1e-07, callback=None, perturbation=None)


For a function $f$ and an initial point $\vec\theta_0$, the standard (or “vanilla”) gradient descent method is an iterative scheme to find the minimum $\vec\theta^*$ of $f$ by updating the parameters in the direction of the negative gradient of $f$

$\vec\theta_{n+1} = \vec\theta_{n} - \eta_n \vec\nabla f(\vec\theta_{n}),$

for a small learning rate $\eta_n > 0$.
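The update rule above can be sketched in a few lines of dependency-free Python. This is only an illustration of the iteration, not the library's implementation: the names `gradient_descent_sketch`, `grad_quadratic`, and `eta` are made up for this example, and a constant learning rate is assumed.

```python
def gradient_descent_sketch(grad, theta0, eta=0.1, maxiter=100):
    """Iterate theta_{n+1} = theta_n - eta * grad(theta_n) with a constant eta."""
    theta = list(theta0)
    for _ in range(maxiter):
        g = grad(theta)
        # Move each coordinate against its gradient component.
        theta = [t - eta * gi for t, gi in zip(theta, g)]
    return theta


# Example: f(theta) = sum(theta_i ** 2) has gradient 2 * theta and its minimum at 0.
def grad_quadratic(theta):
    return [2.0 * t for t in theta]


theta_star = gradient_descent_sketch(grad_quadratic, [1.0, -0.5])
print(theta_star)
```

For this quadratic, every step shrinks each coordinate by the factor 1 - 2 * eta, so the iterates approach the minimum at the origin.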

You can either provide the analytic gradient $\vec\nabla f$ as jac in the minimize() method, or, if you do not provide it, use a finite difference approximation of the gradient. To adapt the size of the perturbation in the finite difference gradients, set the perturbation property in the initializer.

This optimizer supports a callback function. If provided in the initializer, the optimizer will call the callback in each iteration with the following information, in this order: current number of function evaluations, current parameters, current function value, norm of current gradient.
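As a rough illustration of the callback signature described above, the sketch below records the four values in a list. The name `store_history` and the sample numbers are made up; with the real optimizer, such a function would presumably be passed as the callback argument of the initializer rather than called by hand.

```python
history = []


def store_history(nfev, x, fx, grad_norm):
    """Record the four values in the order the optimizer passes them."""
    history.append({"nfev": nfev, "x": list(x), "fx": fx, "grad_norm": grad_norm})


# Stand-in call with made-up values, mimicking what one iteration would report.
store_history(4, [1.0, 0.5, -0.2], 0.0184, 0.27)
print(history[-1]["fx"])
```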

Examples

A minimal example that will use finite difference gradients with a default perturbation of 0.01 and a default learning rate of 0.01.

import numpy as np
from qiskit.algorithms.optimizers import GradientDescent

def f(x):
    return (np.linalg.norm(x) - 1) ** 2

initial_point = np.array([1, 0.5, -0.2])

# Default settings: finite difference gradients and a learning rate of 0.01.
optimizer = GradientDescent(maxiter=100)

result = optimizer.minimize(fun=f, x0=initial_point)

print(f"Found minimum {result.x} at a value "
      f"of {result.fun} using {result.nfev} evaluations.")

An example where the learning rate is an iterator and we supply the analytic gradient. Note how much faster this converges (i.e. fewer nfev) compared to the previous example.

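import numpy as np
from qiskit.algorithms.optimizers import GradientDescent

def learning_rate():
    power = 0.6
    constant_coeff = 0.1

    def powerlaw():
        n = 0
        while True:
            yield constant_coeff * (n ** power)
            n += 1

    return powerlaw()

def f(x):
    return (np.linalg.norm(x) - 1) ** 2

def grad_f(x):
    return 2 * (np.linalg.norm(x) - 1) * x / np.linalg.norm(x)

initial_point = np.array([1, 0.5, -0.2])

# Pass the generator factory itself, together with the analytic gradient.
optimizer = GradientDescent(maxiter=100, learning_rate=learning_rate)
result = optimizer.minimize(fun=f, jac=grad_f, x0=initial_point)

print(f"Found minimum {result.x} at a value "
      f"of {result.fun} using {result.nfev} evaluations.")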

Another example, where the evaluation of the function has a chance of failing. The user, with specific knowledge about the function, can catch these errors and handle them before passing the result to the optimizer.

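import random
import numpy as np
# Assumed import path for TellData; adjust if your version exposes it elsewhere.
from qiskit.algorithms.optimizers import GradientDescent, TellData

def objective(x):
    # Simulate an evaluation that fails half of the time.
    if random.choice([True, False]):
        return None
    else:
        return (np.linalg.norm(x) - 1) ** 2

def grad(x):
    if random.choice([True, False]):
        return None
    else:
        return 2 * (np.linalg.norm(x) - 1) * x / np.linalg.norm(x)

initial_point = np.random.normal(0, 1, size=(100,))

# The value of maxiter here is illustrative.
optimizer = GradientDescent(maxiter=20)
optimizer.start(x0=initial_point, fun=objective, jac=grad)

while optimizer.continue_condition():
    ask_data = optimizer.ask()
    evaluated_gradient = None

    # Retry until the gradient evaluation succeeds.
    # Assumed field names: AskData.x_jac and TellData.eval_jac.
    while evaluated_gradient is None:
        evaluated_gradient = grad(ask_data.x_jac)
        optimizer.state.njev += 1

    optimizer.state.nit += 1

    tell_data = TellData(eval_jac=evaluated_gradient)
    optimizer.tell(ask_data=ask_data, tell_data=tell_data)

result = optimizer.create_result()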

Users who aren’t dealing with complicated functions and who are more familiar with step-by-step optimization algorithms can use the step() method, which wraps the ask() and tell() methods. In the same spirit, the minimize() method optimizes the function and returns the result.
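The split into ask(), evaluate(), tell(), and step() described above can be illustrated with a toy class. `ToySteppableOptimizer` is a hypothetical stand-in written for this example; it passes plain lists around instead of the AskData and TellData objects and makes no claim about how the real optimizer is implemented.

```python
class ToySteppableOptimizer:
    """Hypothetical stand-in that mirrors the ask/evaluate/tell/step split."""

    def __init__(self, x0, jac, learning_rate=0.1):
        self.x = list(x0)
        self.jac = jac
        self.learning_rate = learning_rate
        self.nit = 0

    def ask(self):
        # Hand out the point at which the gradient should be evaluated.
        return list(self.x)

    def evaluate(self, point):
        # Default evaluation; callers may replace this step with their own logic.
        return self.jac(point)

    def tell(self, point, gradient):
        # Consume the evaluated gradient and update the internal state.
        self.x = [p - self.learning_rate * g for p, g in zip(point, gradient)]
        self.nit += 1

    def step(self):
        # One full iteration: ask for a point, evaluate it, report the result.
        point = self.ask()
        self.tell(point, self.evaluate(point))


toy = ToySteppableOptimizer([1.0, -2.0], jac=lambda x: [2.0 * xi for xi in x])
for _ in range(50):
    toy.step()
print(toy.x, toy.nit)
```

The benefit of the split is that a caller who needs retries, logging, or remote execution can call ask() and tell() directly and do the evaluation in between, while everyone else calls step().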

To see other libraries that use this interface, visit: https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/009_ask_and_tell.html

Parameters

• maxiter (int) – The maximum number of iterations.
• learning_rate (Union[float, List[float], ndarray, Callable[[], Iterator]]) – A constant, list, array or factory of generators yielding learning rates for the parameter updates. See the docstring for an example.
• tol (float) – If the norm of the parameter update is smaller than this threshold, the optimizer has converged.
• callback (Optional[Callable[[int, ndarray, float, float], None]]) – A function that is called in each iteration with the current number of function evaluations, the current parameters, the current function value, and the norm of the current gradient.
• perturbation (Optional[float]) – If no gradient is passed to minimize(), the gradient is approximated with a forward finite difference scheme that uses a perturbation of size perturbation in both directions (defaults to 1e-2 if required). Ignored when an explicit gradient function is provided.

Raises

ValueError – If learning_rate is an array and its length is less than maxiter.
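The accepted forms of learning_rate and the length check behind the ValueError can be pictured with the sketch below. `as_learning_rate_iterator` and `halving` are hypothetical names introduced here; how the optimizer handles the different forms internally is not shown in this reference, so treat this as one plausible reading of the parameter description.

```python
from itertools import islice, repeat


def as_learning_rate_iterator(learning_rate, maxiter):
    """Hypothetical helper: turn any accepted form into an iterator of floats."""
    if callable(learning_rate):
        # A factory: calling it returns a fresh generator.
        return learning_rate()
    if isinstance(learning_rate, (int, float)):
        # A constant: yield the same value forever.
        return repeat(float(learning_rate))
    # A list or array: one value per iteration, so it must cover maxiter steps.
    if len(learning_rate) < maxiter:
        raise ValueError(
            f"Got {len(learning_rate)} learning rates for maxiter={maxiter}."
        )
    return iter(learning_rate)


def halving():
    # Example factory: each learning rate is half of the previous one.
    rate = 0.1
    while True:
        yield rate
        rate /= 2


print(list(islice(as_learning_rate_iterator(0.01, maxiter=3), 3)))
print(list(as_learning_rate_iterator([0.1, 0.05, 0.025], maxiter=3)))
print(list(islice(as_learning_rate_iterator(halving, maxiter=3), 3)))
```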

## Methods

### ask

GradientDescent.ask()

Returns an object with the data needed to evaluate the gradient.

If this object contains a gradient function the gradient can be evaluated directly. Otherwise approximate it with a finite difference scheme.

Return type

AskData

### continue_condition

GradientDescent.continue_condition()

Condition that indicates whether the optimization process should continue.

When the stepsize is smaller than the tolerance, the optimization process is considered finished.

Return type

bool

Returns

True if the optimization process should continue, False otherwise.
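A stopping check of this kind could look like the sketch below. The function `should_continue` is hypothetical; combining the tolerance test with the iteration limit, and treating a missing step size as "continue", are assumptions made for the illustration rather than details confirmed by this reference.

```python
def should_continue(stepsize, tol, nit, maxiter):
    """Hypothetical check: keep going while steps are large and iterations remain."""
    if nit >= maxiter:
        return False
    # No step has been taken yet, so there is nothing to compare against tol.
    if stepsize is None:
        return True
    return stepsize > tol


print(should_continue(None, 1e-7, nit=0, maxiter=100))
print(should_continue(1e-9, 1e-7, nit=3, maxiter=100))
```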

### create_result

GradientDescent.create_result()

Creates a result of the optimization process.

This result contains the best point, the best function value, the number of function/gradient evaluations and the number of iterations.

Return type

OptimizerResult

Returns

The result of the optimization process.

### evaluate

GradientDescent.evaluate(ask_data)

Evaluates the gradient.

It does so either by evaluating an analytic gradient or by approximating it with a finite difference scheme. It will either add 1 to the number of gradient evaluations or add N+1 to the number of function evaluations (where N is the dimension of the gradient).

Parameters

ask_data (AskData) – It contains the point where the gradient is to be evaluated and the gradient function or, in its absence, the objective function to perform a finite difference approximation.

Return type

TellData

Returns

The data containing the gradient evaluation.
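The two evaluation paths and their counters can be sketched as follows. `evaluate_gradient` and the `counters` dictionary are hypothetical; the forward-difference formula is a standard choice assumed here, with `eps` playing the role of the perturbation.

```python
def evaluate_gradient(x, fun, jac=None, eps=0.01, counters=None):
    """Hypothetical sketch of the two evaluation paths and their bookkeeping."""
    counters = counters if counters is not None else {"nfev": 0, "njev": 0}
    if jac is not None:
        counters["njev"] += 1  # one analytic gradient evaluation
        return jac(x), counters

    # Forward differences: one evaluation at x plus one per coordinate.
    f0 = fun(x)
    gradient = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += eps
        gradient.append((fun(shifted) - f0) / eps)
    counters["nfev"] += len(x) + 1
    return gradient, counters


def f(x):
    return sum(xi ** 2 for xi in x)


approx, counts = evaluate_gradient([1.0, 2.0, 3.0], f)
print(approx, counts)
```

For a three-dimensional point, the finite difference path costs four function evaluations, which matches the N+1 count stated above.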

### get_support_level

GradientDescent.get_support_level()

Get the support level dictionary.

### gradient_num_diff

static GradientDescent.gradient_num_diff(x_center, f, epsilon, max_evals_grouped=None)

Computes the gradient around the point x_center by numeric differentiation, evaluating the function in a parallel fashion.

Parameters

• x_center (ndarray) – point around which we compute the gradient
• f (func) – the function of which the gradient is to be computed.
• epsilon (float) – the epsilon used in the numeric differentiation.
• max_evals_grouped (int) – max evals grouped, defaults to 1 (i.e. no batching).

Returns

The computed gradient.

Return type

grad
### minimize

GradientDescent.minimize(fun, x0, jac=None, bounds=None)

Minimizes the function.

For well behaved functions the user can call this method to minimize a function. If the user wants more control on how to evaluate the function a custom loop can be created using ask() and tell() and evaluating the function manually.

Parameters

• fun (Callable[[Union[float, ndarray]], float]) – Function to minimize.
• x0 (Union[float, ndarray]) – Initial point.
• jac (Optional[Callable[[Union[float, ndarray]], Union[float, ndarray]]]) – Function to compute the gradient.
• bounds (Optional[List[Tuple[float, float]]]) – Bounds of the search space.

Return type

OptimizerResult

Returns

Object containing the result of the optimization.

### print_options

GradientDescent.print_options()

Print algorithm-specific options.

### set_max_evals_grouped

GradientDescent.set_max_evals_grouped(limit)

Sets max_evals_grouped, the maximum number of function evaluations grouped into one batch.

### set_options

GradientDescent.set_options(**kwargs)

Sets or updates values in the options dictionary.

The options dictionary may be used internally by a given optimizer to pass additional optional values for the underlying optimizer/optimization function used. The options dictionary may be initially populated with a set of key/values when the given optimizer is constructed.

Parameters

kwargs (dict) – options, given as name=value.

### start

GradientDescent.start(fun, x0, jac=None, bounds=None)

Populates the state of the optimizer with the data provided and sets all the counters to 0.

Parameters

• fun (Callable[[Union[float, ndarray]], float]) – Function to minimize.
• x0 (Union[float, ndarray]) – Initial point.
• jac (Optional[Callable[[Union[float, ndarray]], Union[float, ndarray]]]) – Function to compute the gradient.
• bounds (Optional[List[Tuple[float, float]]]) – Bounds of the search space.

Return type

None

### step

GradientDescent.step()

Performs one step in the optimization process.

This method composes ask(), evaluate(), and tell() to make a “step” in the optimization process.

Return type

None

### tell

GradientDescent.tell(ask_data, tell_data)

Updates x by an amount proportional to the learning rate and value of the gradient at that point.

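Parameters

• ask_data (AskData) – The data used to evaluate the gradient.
• tell_data (TellData) – The data containing the gradient evaluation.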

Raises

ValueError – If the gradient passed doesn’t have the right dimension.

Return type

None
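The dimension check behind the ValueError can be sketched with plain lists. `tell_update` is a hypothetical function for illustration only; the real method works on the optimizer state and on the AskData and TellData objects.

```python
def tell_update(x, gradient, learning_rate):
    """Hypothetical sketch of the update performed when a gradient is reported."""
    if len(gradient) != len(x):
        raise ValueError(
            f"Gradient has {len(gradient)} entries, expected {len(x)}."
        )
    return [xi - learning_rate * gi for xi, gi in zip(x, gradient)]


print(tell_update([1.0, 2.0], [0.5, -0.5], learning_rate=0.1))
```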

### wrap_function

static GradientDescent.wrap_function(function, args)

Wrap the function to implicitly inject the args at the call of the function.

Parameters

• function (func) – the target function
• args (tuple) – the args to be injected

Returns

wrapper

Return type

function_wrapper
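Injecting extra arguments in this way amounts to a small closure. The sketch below shows the idea under the assumption that args is appended after the arguments given at call time; `wrap_function_sketch` and `scaled_norm_squared` are names made up for this example.

```python
def wrap_function_sketch(function, args):
    """Hypothetical equivalent: return a callable that appends `args` on each call."""

    def function_wrapper(*wrapper_args):
        return function(*(wrapper_args + tuple(args)))

    return function_wrapper


def scaled_norm_squared(x, scale, offset):
    return scale * sum(xi ** 2 for xi in x) + offset


# The wrapped objective only takes x; scale and offset are injected implicitly.
objective = wrap_function_sketch(scaled_norm_squared, (2.0, 1.0))
print(objective([1.0, 2.0]))
```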

## Attributes

### bounds_support_level

Returns the bounds support level.

### initial_point_support_level

Returns the initial point support level.

### is_bounds_ignored

Returns whether bounds are ignored.

### is_bounds_required

Returns whether bounds are required.

### is_bounds_supported

Returns whether bounds are supported.

### is_initial_point_ignored

Returns whether the initial point is ignored.

### is_initial_point_required

Returns whether the initial point is required.

### is_initial_point_supported

Returns whether the initial point is supported.

### perturbation

Returns the perturbation.

This is the perturbation used in the finite difference gradient approximation.

Return type

Optional[float]

### settings

Returns the settings.

Return type

Dict[str, Any]

### state

Return the current state of the optimizer.

Return type

GradientDescentState

### tol

Returns the tolerance of the optimizer.

Any step with smaller stepsize than this value will stop the optimization.

Return type

float