# SPSA

*class *`qiskit.algorithms.optimizers.SPSA(maxiter=100, blocking=False, allowed_increase=None, trust_region=False, learning_rate=None, perturbation=None, last_avg=1, resamplings=1, perturbation_dims=None, second_order=False, regularization=None, hessian_delay=0, lse_solver=None, initial_hessian=None, callback=None, termination_checker=None)`

Bases: `Optimizer`

Simultaneous Perturbation Stochastic Approximation (SPSA) optimizer.

SPSA [1] is an gradient descent method for optimizing systems with multiple unknown parameters. As an optimization method, it is appropriately suited to large-scale population models, adaptive modeling, and simulation optimization.

Many examples are presented at the SPSA Web site(opens in a new tab).

The main feature of SPSA is the stochastic gradient approximation, which requires only two measurements of the objective function, regardless of the dimension of the optimization problem.

Additionally to standard, first-order SPSA, where only gradient information is used, this implementation also allows second-order SPSA (2-SPSA) [2]. In 2-SPSA we also estimate the Hessian of the loss with a stochastic approximation and multiply the gradient with the inverse Hessian to take local curvature into account and improve convergence. Notably this Hessian estimate requires only a constant number of function evaluations unlike an exact evaluation of the Hessian, which scales quadratically in the number of function evaluations.

SPSA can be used in the presence of noise, and it is therefore indicated in situations involving measurement uncertainty on a quantum computation when finding a minimum. If you are executing a variational algorithm using an OpenQASM simulator or a real device, SPSA would be the most recommended choice among the optimizers provided here.

The optimization process can includes a calibration phase if neither the `learning_rate`

nor `perturbation`

is provided, which requires additional functional evaluations. (Note that either both or none must be set.) For further details on the automatic calibration, please refer to the supplementary information section IV. of [3].

This component has some function that is normally random. If you want to reproduce behavior then you should set the random number generator seed in the algorithm_globals (`qiskit.utils.algorithm_globals.random_seed = seed`

).

**Examples**

This short example runs SPSA for the ground state calculation of the `Z ^ Z`

observable where the ansatz is a `PauliTwoDesign`

circuit.

```
import numpy as np
from qiskit.algorithms.optimizers import SPSA
from qiskit.circuit.library import PauliTwoDesign
from qiskit.opflow import Z, StateFn
ansatz = PauliTwoDesign(2, reps=1, seed=2)
observable = Z ^ Z
initial_point = np.random.random(ansatz.num_parameters)
def loss(x):
bound = ansatz.assign_parameters(x)
return np.real((StateFn(observable, is_measurement=True) @ StateFn(bound)).eval())
spsa = SPSA(maxiter=300)
result = spsa.optimize(ansatz.num_parameters, loss, initial_point=initial_point)
```

To use the Hessian information, i.e. 2-SPSA, you can add second_order=True to the initializer of the SPSA class, the rest of the code remains the same.

```
two_spsa = SPSA(maxiter=300, second_order=True)
result = two_spsa.optimize(ansatz.num_parameters, loss, initial_point=initial_point)
```

The termination_checker can be used to implement a custom termination criterion.

```
import numpy as np
from qiskit.algorithms.optimizers import SPSA
def objective(x):
return np.linalg.norm(x) + .04*np.random.rand(1)
class TerminationChecker:
def __init__(self, N : int):
self.N = N
self.values = []
def __call__(self, nfev, parameters, value, stepsize, accepted) -> bool:
self.values.append(value)
if len(self.values) > self.N:
last_values = self.values[-self.N:]
pp = np.polyfit(range(self.N), last_values, 1)
slope = pp[0] / self.N
if slope > 0:
return True
return False
spsa = SPSA(maxiter=200, termination_checker=TerminationChecker(10))
parameters, value, niter = spsa.optimize(2, objective, initial_point=[0.5, 0.5])
print(f'SPSA completed after {niter} iterations')
```

**References**

[1]: J. C. Spall (1998). An Overview of the Simultaneous Perturbation Method for Efficient Optimization, Johns Hopkins APL Technical Digest, 19(4), 482–492. Online at jhuapl.edu.(opens in a new tab)

[2]: J. C. Spall (1997). Accelerated second-order stochastic optimization using only function measurements, Proceedings of the 36th IEEE Conference on Decision and Control, 1417-1424 vol.2. Online at IEEE.org.(opens in a new tab)

[3]: A. Kandala et al. (2017). Hardware-efficient Variational Quantum Eigensolver for Small Molecules and Quantum Magnets. Nature 549, pages242–246(2017). arXiv:1704.05018v2(opens in a new tab)

**Parameters**

**maxiter**(*int*(opens in a new tab)) – The maximum number of iterations. Note that this is not the maximal number of function evaluations.**blocking**(*bool*(opens in a new tab)) – If True, only accepts updates that improve the loss (up to some allowed increase, see next argument).**allowed_increase**(*float*(opens in a new tab)*| None*) – If`blocking`

is`True`

, this argument determines by how much the loss can increase with the proposed parameters and still be accepted. If`None`

, the allowed increases is calibrated automatically to be twice the approximated standard deviation of the loss function.**trust_region**(*bool*(opens in a new tab)) – If`True`

, restricts the norm of the update step to be $\leq 1$.**learning_rate**(*float*(opens in a new tab)*| np.ndarray | Callable[[], Iterator] | None*) – The update step is the learning rate is multiplied with the gradient. If the learning rate is a float, it remains constant over the course of the optimization. If a NumPy array, the $i$-th element is the learning rate for the $i$-th iteration. It can also be a callable returning an iterator which yields the learning rates for each optimization step. If`learning_rate`

is set`perturbation`

must also be provided.**perturbation**(*float*(opens in a new tab)*| np.ndarray | Callable[[], Iterator] | None*) – Specifies the magnitude of the perturbation for the finite difference approximation of the gradients. See`learning_rate`

for the supported types. If`perturbation`

is set`learning_rate`

must also be provided.**last_avg**(*int*(opens in a new tab)) – Return the average of the`last_avg`

parameters instead of just the last parameter values.**resamplings**(*int*(opens in a new tab)*|**dict*(opens in a new tab)*[**int*(opens in a new tab)*,**int*(opens in a new tab)*]*) – The number of times the gradient (and Hessian) is sampled using a random direction to construct a gradient estimate. Per default the gradient is estimated using only one random direction. If an integer, all iterations use the same number of resamplings. If a dictionary, this is interpreted as`{iteration: number of resamplings per iteration}`

.**perturbation_dims**(*int*(opens in a new tab)*| None*) – The number of perturbed dimensions. Per default, all dimensions are perturbed, but a smaller, fixed number can be perturbed. If set, the perturbed dimensions are chosen uniformly at random.**second_order**(*bool*(opens in a new tab)) – If True, use 2-SPSA instead of SPSA. In 2-SPSA, the Hessian is estimated additionally to the gradient, and the gradient is preconditioned with the inverse of the Hessian to improve convergence.**regularization**(*float*(opens in a new tab)*| None*) – To ensure the preconditioner is symmetric and positive definite, the identity times a small coefficient is added to it. This generator yields that coefficient.**hessian_delay**(*int*(opens in a new tab)) – Start multiplying the gradient with the inverse Hessian only after a certain number of iterations. The Hessian is still evaluated and therefore this argument can be useful to first get a stable average over the last iterations before using it as preconditioner.**lse_solver**(*Callable[[np.ndarray, np.ndarray], np.ndarray] | None*) – The method to solve for the inverse of the Hessian. Per default an exact LSE solver is used, but can e.g. be overwritten by a minimization routine.**initial_hessian**(*np.ndarray | None*) – The initial guess for the Hessian. By default the identity matrix is used.**callback**(*CALLBACK | None*) – A callback function passed information in each iteration step. The information is, in this order: the number of function evaluations, the parameters, the function value, the stepsize, whether the step was accepted.**termination_checker**(*TERMINATIONCHECKER | None*) – A callback function executed at the end of each iteration step. The arguments are, in this order: the parameters, the function value, the number of function evaluations, the stepsize, whether the step was accepted. If the callback returns True, the optimization is terminated. To prevent additional evaluations of the objective method, if the objective has not yet been evaluated, the objective is estimated by taking the mean of the objective evaluations used in the estimate of the gradient.

**Raises**

**ValueError**(opens in a new tab) – If `learning_rate`

or `perturbation`

is an array with less elements than the number of iterations.

## Attributes

### bounds_support_level

Returns bounds support level

### gradient_support_level

Returns gradient support level

### initial_point_support_level

Returns initial point support level

### is_bounds_ignored

Returns is bounds ignored

### is_bounds_required

Returns is bounds required

### is_bounds_supported

Returns is bounds supported

### is_gradient_ignored

Returns is gradient ignored

### is_gradient_required

Returns is gradient required

### is_gradient_supported

Returns is gradient supported

### is_initial_point_ignored

Returns is initial point ignored

### is_initial_point_required

Returns is initial point required

### is_initial_point_supported

Returns is initial point supported

### setting

Return setting

### settings

## Methods

### calibrate

*static *`calibrate(loss, initial_point, c=0.2, stability_constant=0, target_magnitude=None, alpha=0.602, gamma=0.101, modelspace=False, max_evals_grouped=1)`

Calibrate SPSA parameters with a powerseries as learning rate and perturbation coeffs.

The powerseries are:

$a_k = \frac{a}{(A + k + 1)^\alpha}, c_k = \frac{c}{(k + 1)^\gamma}$**Parameters**

**loss**(*Callable[[np.ndarray],**float*(opens in a new tab)*]*) – The loss function.**initial_point**(*np.ndarray*) – The initial guess of the iteration.**c**(*float*(opens in a new tab)) – The initial perturbation magnitude.**stability_constant**(*float*(opens in a new tab)) – The value of A.**target_magnitude**(*float*(opens in a new tab)*| None*) – The target magnitude for the first update step, defaults to $2\pi / 10$.**alpha**(*float*(opens in a new tab)) – The exponent of the learning rate powerseries.**gamma**(*float*(opens in a new tab)) – The exponent of the perturbation powerseries.**modelspace**(*bool*(opens in a new tab)) – Whether the target magnitude is the difference of parameter values or function values (= model space).**max_evals_grouped**(*int*(opens in a new tab)) – The number of grouped evaluations supported by the loss function. Defaults to 1, i.e. no grouping.

**Returns**

**A tuple of powerseries generators, the first one for the**

learning rate and the second one for the perturbation.

**Return type**

tuple(opens in a new tab)(generator, generator)

### estimate_stddev

*static *`estimate_stddev(loss, initial_point, avg=25, max_evals_grouped=1)`

Estimate the standard deviation of the loss function.

**Return type**

### get_support_level

`get_support_level()`

Get the support level dictionary.

### gradient_num_diff

*static *`gradient_num_diff(x_center, f, epsilon, max_evals_grouped=None)`

We compute the gradient with the numeric differentiation in the parallel way, around the point x_center.

**Parameters**

**x_center**(*ndarray*) – point around which we compute the gradient**f**(*func*) – the function of which the gradient is to be computed.**epsilon**(*float*(opens in a new tab)) – the epsilon used in the numeric differentiation.**max_evals_grouped**(*int*(opens in a new tab)) – max evals grouped, defaults to 1 (i.e. no batching).

**Returns**

the gradient computed

**Return type**

grad

### minimize

`minimize(fun, x0, jac=None, bounds=None)`

Minimize the scalar function.

**Parameters**

**fun**(*Callable[[POINT],**float*(opens in a new tab)*]*) – The scalar function to minimize.**x0**(*POINT*) – The initial point for the minimization.**jac**(*Callable[[POINT], POINT] | None*) – The gradient of the scalar function`fun`

.**bounds**(*list*(opens in a new tab)*[**tuple*(opens in a new tab)*[**float*(opens in a new tab)*,**float*(opens in a new tab)*]] | None*) – Bounds for the variables of`fun`

. This argument might be ignored if the optimizer does not support bounds.

**Returns**

The result of the optimization, containing e.g. the result as attribute `x`

.

**Return type**

### optimize

`optimize(num_vars, objective_function, gradient_function=None, variable_bounds=None, initial_point=None)`

Perform optimization.

The method `qiskit.algorithms.optimizers.spsa.SPSA.optimize()`

is deprecated as of qiskit-terra 0.21.0. It will be removed no earlier than 3 months after the release date. Instead, use `SPSA.minimize`

as a replacement, which supports the same arguments but follows the interface of scipy.optimize and returns a complete result object containing additional information.

**Parameters**

**num_vars**(*int*(opens in a new tab)) – Number of parameters to be optimized.**objective_function**(*callable*) – A function that computes the objective function.**gradient_function**(*callable*) – Not supported for SPSA.**variable_bounds**(*list*(opens in a new tab)*[(**float*(opens in a new tab)*,**float*(opens in a new tab)*)]*) – Not supported for SPSA.**initial_point**(*numpy.ndarray*(opens in a new tab)*[**float*(opens in a new tab)*]*) – Initial point.

**Returns**

**point, value, nfev**

point: is a 1D numpy.ndarray[float] containing the solution value: is a float with the objective function value nfev: number of objective function calls made if available or None

**Return type**

### print_options

`print_options()`

Print algorithm-specific options.

### set_max_evals_grouped

`set_max_evals_grouped(limit)`

Set max evals grouped

### set_options

`set_options(**kwargs)`

Sets or updates values in the options dictionary.

The options dictionary may be used internally by a given optimizer to pass additional optional values for the underlying optimizer/optimization function used. The options dictionary may be initially populated with a set of key/values when the given optimizer is constructed.

**Parameters**

**kwargs** (*dict*(opens in a new tab)) – options, given as name=value.

### wrap_function

*static *`wrap_function(function, args)`

Wrap the function to implicitly inject the args at the call of the function.

**Parameters**

**function**(*func*) – the target function**args**(*tuple*(opens in a new tab)) – the args to be injected

**Returns**

wrapper

**Return type**

function_wrapper