It looks like the limit on the model context length is incorrect. The fp16 model, like the original one, has a context length of 131072. Updating this value resolved errors while processing longer prompts.

by dtrawins - opened Sep 3, 2025

base: refs/heads/main ← from: refs/pr/2


dtrawins

OpenVINO Toolkit org Sep 3, 2025

No description provided.

It looks like the limit on the model context length is incorrect. The fp16 model, like the original one, has a context length of 131072. Updating this value resolved errors while processing longer prompts. (commit b419af4e)
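For anyone who wants to check a local copy of the model, here is a minimal sketch. It assumes the value in question is the `max_position_embeddings` key in the model's `config.json` (the key named in the reply below), and that 131072 is the intended length as reported above. The helper names `check_context_length` and `set_context_length` are hypothetical and not part of any library. Changing the config does not by itself guarantee that the INT4 export handles the longer context.

```python
import json
from pathlib import Path

# Context length reported in this PR for the fp16/original model.
EXPECTED_CONTEXT_LENGTH = 131072


def check_context_length(config_path, expected=EXPECTED_CONTEXT_LENGTH):
    """Return (current_value, matches_expected) for max_position_embeddings."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    current = config.get("max_position_embeddings")  # None if the key is absent
    return current, current == expected


def set_context_length(config_path, value=EXPECTED_CONTEXT_LENGTH):
    """Rewrite the config file with max_position_embeddings set to value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["max_position_embeddings"] = value  # all other keys are left untouched
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
```

Calling `check_context_length("config.json")` on a downloaded copy shows the current value without modifying anything; `set_context_length` is only needed if you decide to patch the file locally.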

amokrov

OpenVINO Toolkit org Sep 18, 2025

This is a known issue and a current limitation of the INT4 model. When optimum-intel allows preserving the original max_position_embeddings, we will re-upload the model.

amokrov changed pull request status to closed Sep 18, 2025
