Add eos_token to tokenizer explicitly to help downstream vLLM integration

#43
by dongwng - opened

Summary

Declare the Parakeet TDT end-of-transcription token in the standard Hugging Face tokenizer and generation config metadata.

This updates:

  • tokenizer_config.json to mark <|endoftext|> as the tokenizer EOS token
  • generation_config.json to set eos_token_id to 3
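The two metadata changes above can be sketched as a small script run against a local checkout. This is a minimal illustration, not the PR's actual diff: the helper name `declare_eos` and the directory argument are hypothetical, and it assumes the EOS token is stored as a plain string under `eos_token` (tokenizer configs may also store it as a dictionary with a `content` field).

```python
import json
from pathlib import Path

EOS_TOKEN = "<|endoftext|>"
EOS_TOKEN_ID = 3


def declare_eos(repo_dir):
    """Set the EOS fields in local copies of the two metadata files (hypothetical helper)."""
    repo = Path(repo_dir)
    updates = {
        "tokenizer_config.json": {"eos_token": EOS_TOKEN},
        "generation_config.json": {"eos_token_id": EOS_TOKEN_ID},
    }
    for filename, fields in updates.items():
        path = repo / filename
        # Start from the existing file, if any, so all other keys are preserved.
        config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
        config.update(fields)
        path.write_text(
            json.dumps(config, indent=2, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
```

Only the two EOS keys are written; vocabulary files and weights are not touched.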

Rationale

The Parakeet tokenizer already contains <|endoftext|> at token id 3:

tokenizer.convert_tokens_to_ids("<|endoftext|>") == 3
tokenizer.decode([3]) == "<|endoftext|>"

The model also emits token id 3 as the terminal marker during TDT decoding. However, the current repository metadata does not expose that token as EOS:

tokenizer.eos_token_id is None
GenerationConfig.from_pretrained(...).eos_token_id is None

Downstream runtimes that rely on standard Hugging Face metadata therefore cannot discover the model’s stop token from either the tokenizer config or generation_config.json.
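To illustrate the lookup a downstream runtime might perform, here is a minimal sketch that reads only the standard metadata files. It is an assumption about how such a lookup could work, not the logic of vLLM or any specific framework; `discover_eos_token_id` and the `token_to_id` mapping are hypothetical names.

```python
import json
from pathlib import Path


def read_json(path):
    """Return the parsed JSON file, or an empty dict if the file is missing."""
    path = Path(path)
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def discover_eos_token_id(repo_dir, token_to_id):
    """Resolve the stop token from standard metadata, or return None if undeclared."""
    repo = Path(repo_dir)

    # Preferred source: generation_config.json (may hold an int or a list of ints).
    eos_id = read_json(repo / "generation_config.json").get("eos_token_id")
    if eos_id is not None:
        return eos_id

    # Fallback: the tokenizer's declared EOS token, mapped through the vocabulary.
    eos_token = read_json(repo / "tokenizer_config.json").get("eos_token")
    if isinstance(eos_token, dict):
        # Some tokenizer configs store special tokens as dictionaries.
        eos_token = eos_token.get("content")
    if eos_token is None:
        return None
    return token_to_id.get(eos_token)
```

With neither field declared, the function returns `None` even though `<|endoftext|>` is present in the vocabulary, which is the gap described above.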

Many ASR and speech-generation checkpoints expose this metadata through the standard files. For example, Whisper, Qwen3-ASR, Granite Speech, Fun-ASR, and FireRed ASR/LID publish eos_token_id in generation_config.json, and usually also declare the corresponding tokenizer EOS token.

Adding this metadata makes Parakeet consistent with those checkpoints and avoids downstream framework-specific workarounds.

Compatibility

This does not change tokenizer vocabulary or model weights. It only declares existing semantics:

  • <|endoftext|> already exists in the tokenizer vocabulary
  • token id 3 already decodes to <|endoftext|>
  • token id 3 is already used as the model’s end marker

Expected behavior after the change:

tokenizer.eos_token == "<|endoftext|>"
tokenizer.eos_token_id == 3
GenerationConfig.from_pretrained(...).eos_token_id == 3

Downstream impact

This helps runtimes such as vLLM, Transformers-based serving stacks, and OpenAI-compatible transcription servers stop generation cleanly without adding Parakeet-specific stop-token workarounds. One example is the vLLM integration PR https://github.com/vllm-project/vllm/pull/41708, which will be simpler once this PR is merged.
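As a rough illustration of what "stopping cleanly" means for a consumer of the metadata, the sketch below truncates a decoded token sequence at the declared EOS id. The function name `truncate_at_eos` is hypothetical, and real runtimes typically stop during decoding rather than trimming afterwards; this only shows why the id needs to be discoverable.

```python
def truncate_at_eos(token_ids, eos_token_id):
    """Return the tokens before the first EOS id; keep everything if no EOS is declared."""
    if eos_token_id is None:
        # Without declared metadata there is nothing to stop on.
        return list(token_ids)
    kept = []
    for token_id in token_ids:
        if token_id == eos_token_id:
            break
        kept.append(token_id)
    return kept
```

For instance, `truncate_at_eos([10, 11, 3, 12], 3)` returns `[10, 11]`, while passing `None` as the EOS id leaves the sequence unchanged.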

dongwng changed pull request status to open
NVIDIA org

LGTM, thanks @dongwng

nithinraok changed pull request status to merged
