Open
Description
Describe the bug
Sample notebook:
https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-agent-react/llm-agent-react.ipynb
I converted a model for NPU using channel-wise quantization:
optimum-cli export openvino -m Qwen/Qwen2.5-3B-Instruct --weight-format int4 --sym --ratio 1.0 --group-size -1 Qwen2.5-3B-Instruct_NPU
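For anyone reproducing this from a script, the same export command can be written as an argument list. This is only a sketch: the flag descriptions in the comments reflect my understanding of the `optimum-cli` options and are not taken from the report, apart from the channel-wise note for `--group-size -1`.

```python
import shlex

# Same arguments as the command above, one option per line.
export_cmd = [
    "optimum-cli", "export", "openvino",
    "-m", "Qwen/Qwen2.5-3B-Instruct",   # source model on the Hugging Face Hub
    "--weight-format", "int4",          # 4-bit weight compression
    "--sym",                            # symmetric quantization
    "--ratio", "1.0",                   # assumed: compress all eligible layers to int4
    "--group-size", "-1",               # -1 = no grouping, i.e. channel-wise
    "Qwen2.5-3B-Instruct_NPU",          # output directory
]

# Print the command as a single shell-safe string for copy and paste.
print(shlex.join(export_cmd))

# To actually run the export (requires optimum-intel to be installed):
# import subprocess
# subprocess.run(export_cmd, check=True)
```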
Then I tried to run it on NPU, but it failed.
Code that I changed in notebooks/llm-agent-react/llm-agent-react.ipynb:
from pathlib import Path

# Original: llm_model_path = llm_model_id.value.split("/")[-1]
llm_model_path = "Qwen2.5-3B-Instruct_NPU"  # output directory of the export command above

# Original: llm_device = device_widget("GPU", exclude=["NPU"])
llm_device = device_widget("NPU")
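Before digging into the notebook itself, it may help to confirm that OpenVINO sees the NPU at all. The following is a minimal sketch, separate from the notebook: `list_openvino_devices` is a hypothetical helper name, and it only relies on `openvino.Core().available_devices`.

```python
from typing import List, Optional


def list_openvino_devices() -> Optional[List[str]]:
    """Return the device names OpenVINO reports, or None if openvino is not installed."""
    try:
        import openvino as ov
    except ImportError:
        return None
    return list(ov.Core().available_devices)


devices = list_openvino_devices()
if devices is None:
    print("openvino is not installed in this environment")
else:
    print("available devices:", devices)
    # Device names may carry a suffix, so match on the prefix.
    print("NPU visible:", any(d.startswith("NPU") for d in devices))
```

If NPU is missing from the list, the failure is more likely a driver or installation problem than something specific to the converted model.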
Expected behavior
The model should run on NPU as it does on CPU and GPU.
Screenshots
Installation instructions (Please mark the checkbox)
[x] I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.
