Use Qiskit Code Assistant in local mode
Learn how to install, configure, and use the Qiskit Code Assistant model on your local machine.
- Qiskit Code Assistant is in preview release status and is subject to change.
- If you have feedback or want to contact the developer team, use the Qiskit Slack Workspace channel or the related public GitHub repositories.
Download the Qiskit Code Assistant model
The Qiskit Code Assistant model is available in GGUF format and can be downloaded from the Hugging Face Hub in one of two ways. GGUF is a binary format that is designed for fast loading and saving of models, and for ease of reading.
Download from the Hugging Face website
Follow these steps to download the Qiskit Code Assistant GGUF model from the Hugging Face website:
- Navigate to the granite-8b-qiskit model page
- Go to the Files and Versions tab and download the GGUF model
Download using the Hugging Face CLI
To download the granite-8b-qiskit GGUF model using the Hugging Face CLI, follow these steps:
- Install the Hugging Face CLI
- Log in to your Hugging Face account
huggingface-cli login
- Download the granite-8b-qiskit GGUF model
huggingface-cli download <HF REPO NAME> <GGUF PATH> --local-dir <LOCAL PATH>
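The CLI download step can also be scripted. The following is a minimal sketch, not an official helper: it only assembles and runs the same huggingface-cli download command shown above. The function names build_download_command and download_model are hypothetical, and the sketch assumes huggingface-cli is installed and on your PATH.

```python
import subprocess


def build_download_command(repo_name, gguf_path, local_dir):
    """Return the huggingface-cli invocation as an argument list."""
    return [
        "huggingface-cli", "download",
        repo_name, gguf_path,
        "--local-dir", local_dir,
    ]


def download_model(repo_name, gguf_path, local_dir):
    """Run the download; raises CalledProcessError if the CLI exits with an error."""
    command = build_download_command(repo_name, gguf_path, local_dir)
    subprocess.run(command, check=True)
```

A call might look like download_model("Qiskit/granite-8b-qiskit-GGUF", "<GGUF PATH>", "<LOCAL PATH>"). The repository name here is inferred from the Ollama command used in this guide, and the other two arguments remain placeholders that you must replace.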
Use the Qiskit Code Assistant model
There are multiple ways to deploy and interact with the downloaded granite-8b-qiskit GGUF model. This guide demonstrates two of them: the Ollama application, using either the Hugging Face Hub integration or a locally downloaded model, and the llama-cpp-python package.
Using the Ollama application
The Ollama application provides a simple way to run GGUF models locally. Its CLI makes setup, model management, and interaction straightforward, so it is well suited to quick experimentation and to users who want to handle fewer technical details.
Install Ollama
- Download the Ollama application
- Install the downloaded file
- Launch the installed Ollama application
Info: The application is running successfully when the Ollama icon appears in the desktop menu bar. You can also verify the service is running by going to http://localhost:11434/.
- Try Ollama in your terminal and start running models. For example:
ollama run granite3-dense:8b
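The check against http://localhost:11434/ can also be done from Python. This is a minimal sketch using only the standard library. It assumes the service answers a plain GET request on its root URL with HTTP 200 when it is up, and ollama_is_running is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434/", timeout=2.0):
    """Return True if base_url answers a GET request with HTTP 200."""
    # Bypass any configured HTTP proxy: the service runs on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-success HTTP status.
        return False
```

Calling ollama_is_running() before sending prompts lets a script fail early with a clear message instead of a connection error in the middle of a run.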
Set up Ollama using the Hugging Face Hub integration
The Ollama/Hugging Face Hub integration lets you interact with GGUF models hosted on the Hugging Face Hub without creating a new Modelfile or manually downloading the GGUF file. Default template and params files are already included for the GGUF model on the Hugging Face Hub.
- Make sure the Ollama application is running.
- Go to the granite-8b-qiskit GGUF model page and choose ollama from the Use this model dropdown.
- From your terminal, run the command:
ollama run hf.co/Qiskit/granite-8b-qiskit-GGUF
Set up Ollama with the Qiskit Code Assistant GGUF model
If you have manually downloaded the GGUF model and want to experiment with different templates and parameters, follow these steps to load it into your local Ollama application.
- Create a Modelfile with the following content, and be sure to update <PATH-TO-GGUF-FILE> to the actual path of your downloaded model.
FROM <PATH-TO-GGUF-FILE>
TEMPLATE """{{ if .System }} System: {{ .System }} {{ end }}{{ if .Prompt }}Question: {{ .Prompt }} {{ end }}Answer: ```python{{ .Response }} """
PARAMETER stop "Question:"
PARAMETER stop "Answer:"
PARAMETER stop "System:"
PARAMETER stop "```"
PARAMETER temperature 0
PARAMETER top_k 1
- Run the following command to create a custom model instance based on the Modelfile.
ollama create granite-8b-qiskit -f ./path-to-model-file
Note: This process may take some time while Ollama reads the model file, initializes the model instance, and configures it according to the specifications provided.
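When experimenting with different templates and parameters, it can help to generate the Modelfile from a script rather than editing it by hand. The following is a minimal sketch under stated assumptions: render_modelfile is a hypothetical helper, it covers only the FROM, TEMPLATE, and PARAMETER instructions used above, and it does no escaping of the template text.

```python
def render_modelfile(gguf_path, template=None, parameters=None):
    """Return Modelfile text with FROM, optional TEMPLATE, and PARAMETER lines.

    parameters maps a parameter name to a single value or a list of values.
    A list produces one PARAMETER line per value, as with repeated stop strings.
    """
    lines = [f"FROM {gguf_path}"]
    if template is not None:
        lines.append(f'TEMPLATE """{template}"""')
    for name, value in (parameters or {}).items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            # Quote strings so values such as "Question:" stay intact.
            rendered = f'"{item}"' if isinstance(item, str) else str(item)
            lines.append(f"PARAMETER {name} {rendered}")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing that file to ollama create with the -f option, as in the step above, would then create a model instance for each variant you want to compare.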
Run the Qiskit Code Assistant model in Ollama
After the granite-8b-qiskit GGUF model has been set up in Ollama, run the following command to launch the model and interact with it in the terminal (in chat mode).
ollama run granite-8b-qiskit
Some useful commands:
- ollama list - List models on your computer
- ollama rm granite-8b-qiskit - Remove/delete the model
- ollama show granite-8b-qiskit - Show model information
- ollama stop granite-8b-qiskit - Stop a model that is currently running
- ollama ps - List which models are currently loaded
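Besides the terminal chat mode, the running Ollama service can be prompted over its local REST API. This is a minimal sketch using only the standard library. It assumes the service listens on http://localhost:11434, that the model was created under the name granite-8b-qiskit as above, and that a non-streaming request is acceptable. The function names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, model="granite-8b-qiskit",
                           base_url="http://localhost:11434"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

For example, generate("Generate a quantum circuit with 2 qubits") would return the completion as a string once the model has finished generating.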
Using the llama-cpp-python package
An alternative to the Ollama application is the llama-cpp-python package, which is a Python binding for llama.cpp. It gives you more control and flexibility to run the GGUF model locally, and is ideal for users who wish to integrate the local model in their workflows and Python applications.
- Install llama-cpp-python
- Interact with the model from within your application using llama_cpp. For example:
from llama_cpp import Llama

# Replace the placeholder with the path to your downloaded GGUF file
model_path = "<PATH-TO-GGUF-FILE>"

model = Llama(
    model_path=model_path,
    seed=17,
    n_ctx=10000,
    n_gpu_layers=37,  # number of layers to offload to the GPU; use 0 to run entirely on the CPU
)

prompt = "Generate a quantum circuit with 2 qubits"
raw_pred = model(prompt)["choices"][0]["text"]
You can also pass text generation parameters to the model call to customize the inference:
generation_kwargs = {
    "max_tokens": 512,
    "echo": False,  # do not echo the prompt in the output
    "top_k": 1,
}

raw_pred = model(prompt, **generation_kwargs)["choices"][0]["text"]
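When the model is called directly through llama-cpp-python, the prompt string is passed to the model as written, and the Ollama template from the Modelfile is not applied. The following is a minimal sketch that assumes you want to mirror that System/Question/Answer layout and its stop strings. The exact line breaks are an assumption, and build_prompt and STOP_SEQUENCES are hypothetical names.

```python
# Three backticks, built this way so the example is easy to embed in documents.
FENCE = "`" * 3

# Stop strings taken from the PARAMETER stop lines of the Modelfile.
STOP_SEQUENCES = ["Question:", "Answer:", "System:", FENCE]


def build_prompt(question, system=None):
    """Format a prompt in the System/Question/Answer layout (whitespace assumed)."""
    parts = []
    if system:
        parts.append(f"System:\n{system}\n\n")
    parts.append(f"Question:\n{question}\n\n")
    # End with an opening Python fence so the model continues with code.
    parts.append(f"Answer:\n{FENCE}python")
    return "".join(parts)
```

With the model object from the example above, a call such as model(build_prompt("Generate a quantum circuit with 2 qubits"), stop=STOP_SEQUENCES, max_tokens=512) would then stop generating at the closing fence or at the next Question, Answer, or System marker.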
Use the Qiskit Code Assistant extensions
Use the VS Code extension and JupyterLab extension for the Qiskit Code Assistant to prompt the locally deployed granite-8b-qiskit GGUF model. Once you have the Ollama application set up with the model, you can configure the extensions to connect to the local service.
Connect with the Qiskit Code Assistant VS Code extension
With the Qiskit Code Assistant VS Code extension, you can interact with the model and perform code completion while writing your code. This can work well for users looking for assistance writing Qiskit code for their Python applications.
- Install the Qiskit Code Assistant VS Code extension.
- In VS Code, go to the User Settings and set the Qiskit Code Assistant: Url to the URL of your local Ollama deployment (for example, http://localhost:11434).
- Reload VS Code by going to View > Command Palette... and selecting Developer: Reload Window.
The granite-8b-qiskit model configured in Ollama should appear in the status bar and is then ready to use.
Connect with the Qiskit Code Assistant JupyterLab extension
With the Qiskit Code Assistant JupyterLab extension, you can interact with the model and perform code completion directly in your Jupyter Notebook. Users who predominantly work with Jupyter Notebooks can take advantage of this extension to further enhance their experience writing Qiskit code.
- Install the Qiskit Code Assistant JupyterLab extension.
- In JupyterLab, go to the Settings Editor and set the Qiskit Code Assistant Service API to the URL of your local Ollama deployment (for example, http://localhost:11434).
The granite-8b-qiskit model configured in Ollama should appear in the status bar and is then ready to use.
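If the model does not appear in either extension, it may help to confirm which models the local Ollama service reports at the URL you configured. This is a minimal sketch using only the standard library. It assumes the service exposes its model list at /api/tags as a JSON object with a models array, and the function names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response body."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_local_models(base_url="http://localhost:11434"):
    """Fetch the names of the models known to the local Ollama service."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as response:
        return model_names(json.loads(response.read()))
```

The returned names usually carry a tag suffix, so a model created as granite-8b-qiskit would typically be listed as granite-8b-qiskit:latest.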