
Choose the right execution mode

Utility-scale workloads can take many hours to complete, so it is important that both the classical and quantum resources are scheduled efficiently to streamline the execution. Execution modes provide flexibility in balancing the cost and time tradeoff to use resources optimally for your workloads. There are several aspects to consider when choosing which execution mode to use, such as overall execution time (maximum time to live, or TTL) and time between jobs (interactive TTL).

The benefits of each are summarized below; a short code sketch showing how each mode is opened follows the list:

  • Batch
    • The entire batch of jobs is scheduled together and there is no additional queuing time for each.
    • The jobs' classical computation, such as compilation, is run in parallel. Thus, running multiple jobs in a batch is significantly faster than running them serially.
    • There is usually minimal delay between jobs, which can help avoid drift.
    • If you partition your workload into multiple jobs and run them in batch mode, you can get results from individual jobs, which makes them more flexible to work with. For example, if a job's results don't meet your expectations, you can cancel the remaining jobs. Also, if one job fails, you can re-submit it instead of re-running the entire workload.
    • Batch mode is generally less expensive than session mode.
  • Session
    • All the functionality of batch mode, but at increased usage; see Workload usage for details on how usage is calculated.
    • Dedicated and exclusive access to the QPU during the session active window.
    • Useful for workloads that don’t have all inputs ready at the outset, for iterative workloads that require classical post-processing before the next job can run, and for experiments that need to run as tightly together as possible.
  • Job
    • Easiest to use when running a small experiment.
    • Might run sooner than batch mode.
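As a quick orientation, here is a minimal sketch of how each mode is opened with the qiskit-ibm-runtime Python package (the `mode` parameter and the V2 primitive names assume a recent version of the package; adjust to your installed version):

```python
from qiskit_ibm_runtime import QiskitRuntimeService, Batch, Session, EstimatorV2 as Estimator

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)

# Job mode: pass the backend directly as the execution mode.
estimator = Estimator(mode=backend)

# Batch mode: group independent jobs so they are scheduled together.
with Batch(backend=backend) as batch:
    estimator = Estimator(mode=batch)
    # submit jobs here with estimator.run(...)

# Session mode: reserve dedicated QPU access for iterative workloads.
with Session(backend=backend) as session:
    estimator = Estimator(mode=session)
    # submit jobs here with estimator.run(...)
```

Passing a backend, a Batch, or a Session as the primitive's mode is what selects job, batch, or session execution, respectively.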

Recommendations and best practices

Generally, use batch mode unless you have workloads that don’t have all inputs ready at the outset.

  • Use batch mode to submit multiple primitive jobs simultaneously to shorten processing time.
  • Use session mode for iterative workloads, or if you need dedicated access to the QPU.
  • Always use job mode to submit a single primitive request.
  • Because sessions are generally more expensive, it is recommended that you use batch whenever you don't need the additional benefits from using sessions.

To ensure the most efficient use of the execution modes, the following practices are recommended:

  • There is a fixed overhead associated with running a job. In general, if each of your jobs uses less than one minute of QPU time, consider combining several into one larger job (this applies to all execution modes). "QPU time" refers to time spent by the QPU complex to process your job.

  • If each of your jobs consumes more than one minute of QPU time, or if combining jobs is not practical, you can still run multiple jobs in parallel. Every job goes through both classical and quantum processing. While a QPU can process only one job at a time, up to five classical jobs can be processed in parallel. You can take advantage of this by submitting multiple jobs in batch or session execution mode.

The above are general guidelines, and you should tune your workload to find the optimal ratio, especially when using sessions. For example, if you are using a session to get exclusive access to a backend, consider breaking up large jobs into smaller ones and running them in parallel. This might be more cost-effective because it can reduce wall-clock time. The sketch below illustrates both combining and splitting.
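Here is a hedged sketch of the two patterns, combining work into one larger job versus splitting it into several jobs whose classical processing can overlap (the circuit and observables are placeholders, and the batch and primitive APIs assume a recent qiskit-ibm-runtime version):

```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import QiskitRuntimeService, Batch, EstimatorV2 as Estimator

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)
pm = generate_preset_pass_manager(optimization_level=1, backend=backend)

# Illustrative inputs: one small circuit measured against several observables.
circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)
isa_circuit = pm.run(circuit)
observables = [SparsePauliOp(p).apply_layout(isa_circuit.layout) for p in ("ZZ", "XX", "YY")]

with Batch(backend=backend) as batch:
    estimator = Estimator(mode=batch)

    # Combining: many PUBs in one run() call become a single, larger job,
    # paying the per-job overhead only once.
    combined_job = estimator.run([(isa_circuit, obs) for obs in observables])

    # Splitting: separate run() calls create separate jobs whose classical
    # processing can overlap within the batch.
    split_jobs = [estimator.run([(isa_circuit, obs)]) for obs in observables]

print(combined_job.result()[0].data.evs)
print([job.result()[0].data.evs for job in split_jobs])
```

The same splitting pattern works inside a Session when you need dedicated access to the backend.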


Examples

Run a quantum variational algorithm

Running a quantum variational algorithm typically follows this flow:

  1. Prepare the ansatz.
  2. Evaluate the cost function on a QPU.
  3. Take the result from the previous step and run it through a classical optimizer.
  4. Adjust the parameters according to the output of (3), then go back to step (2).

In this case, if you were using job or batch mode, each job generated by step (2) would need to go back through the queue. This drastically increases the experiment length (wall-clock time) due to the queuing time. It could also take longer to converge because of device drift; that is, every iteration is supposed to give you a better result, but device drift could make subsequent results worse.

In addition, if you use probabilistic error amplification (PEA) or probabilistic error cancellation (PEC), you can learn the noise model once and apply it to subsequent jobs when running in a dedicated session. This usually doesn't work with batch or job mode because the noise model could become stale by the time the next job gets through the queue. A session-based sketch of the variational loop follows.
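Below is a minimal sketch of this loop inside a session, assuming an illustrative EfficientSU2 ansatz, a toy Hamiltonian, and SciPy's COBYLA optimizer (all of these are stand-ins for your own problem):

```python
import numpy as np
from scipy.optimize import minimize
from qiskit.circuit.library import EfficientSU2
from qiskit.quantum_info import SparsePauliOp
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import QiskitRuntimeService, Session, EstimatorV2 as Estimator

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)

# Step 1: prepare the ansatz and the observable (illustrative choices).
ansatz = EfficientSU2(num_qubits=2, reps=1)
hamiltonian = SparsePauliOp.from_list([("ZZ", 1.0), ("XI", 0.5)])

pm = generate_preset_pass_manager(optimization_level=1, backend=backend)
isa_ansatz = pm.run(ansatz)
isa_hamiltonian = hamiltonian.apply_layout(isa_ansatz.layout)

with Session(backend=backend) as session:
    estimator = Estimator(mode=session)

    # Step 2: cost function evaluated on the QPU; jobs inside the session
    # skip the regular queue, keeping iterations close together.
    def cost(params):
        job = estimator.run([(isa_ansatz, isa_hamiltonian, params)])
        return float(job.result()[0].data.evs)

    # Steps 3-4: the classical optimizer adjusts the parameters and repeats.
    x0 = np.zeros(isa_ansatz.num_parameters)
    result = minimize(cost, x0, method="COBYLA", options={"maxiter": 50})

print(result.x, result.fun)
```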

Compare error mitigation settings

To compare the effects of the available error mitigation methods, you might follow this flow:

  1. Construct a circuit and observable.
  2. Submit primitive jobs that use different combinations of error mitigation settings.
  3. Plot the results to observe the effects of the various settings.

In this case, all jobs (which are related but independent) are available at the outset. If you use batch mode, they are scheduled collectively, so you only have to wait for them to go through the queue once. Additionally, because the goal is to compare the effects of various error mitigation methods, it is beneficial that they run as closely together as possible. Thus, batch would be a good choice. You could run these jobs in a session, but because sessions are generally more expensive, it is recommended that you use batch whenever you don't need the additional functionality that sessions provide. A batch-mode sketch of this comparison follows.
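A hedged sketch of such a comparison in batch mode, sweeping the resilience_level option across three jobs (the circuit, observable, and option values are illustrative; other error mitigation options would follow the same pattern):

```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import QiskitRuntimeService, Batch, EstimatorV2 as Estimator

service = QiskitRuntimeService()
backend = service.least_busy(operational=True, simulator=False)

# Step 1: construct a circuit and observable.
circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)
observable = SparsePauliOp("ZZ")

pm = generate_preset_pass_manager(optimization_level=1, backend=backend)
isa_circuit = pm.run(circuit)
isa_observable = observable.apply_layout(isa_circuit.layout)

# Step 2: submit one job per error mitigation setting inside a single batch.
jobs = {}
with Batch(backend=backend) as batch:
    for level in (0, 1, 2):
        estimator = Estimator(mode=batch)
        estimator.options.resilience_level = level
        jobs[level] = estimator.run([(isa_circuit, isa_observable)])

# Step 3: collect the results to compare (plotting omitted).
for level, job in jobs.items():
    print(level, job.result()[0].data.evs)
```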


