Update eos_token_id and pad_token_id

by alfredplpl - opened Dec 15, 2025

base: refs/heads/main ← from: refs/pr/4


alfredplpl

Dec 15, 2025

eos_token_id is 2, so unless it is set correctly, generation will continue forever.

Commit dd17feea: Update eos_token_id and pad_token_id

sokada

Preferred Networks, Inc. org Dec 16, 2025

Thank you for the PR. I also took a look at https://huggingface.co/alfredplpl/plamo-3-nict-8b-magpie-lora. As the party that released plamo-3, we are very glad that someone ran instruction tuning on it independently of PFN. Thank you.

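We apologize that the published details of plamo-3's pre-training are missing information needed for post-training. Post-training is currently in progress on our side, and because we might still change the chat template, we did not document it explicitly.
We intend to summarize this information in some form at a later date, but for now let me give a brief answer here, covering PR #2 as well, based on the current state.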

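In the pre-training of plamo-3-nict-8b-base, <|plamo:bos|> was used when concatenating data samples. <|plamo:eos|> is in the tokenizer vocab, but it is only present there and was not used during pre-training. For this reason, the change to generation_config.json in this PR (setting eos_token_id=2) is not appropriate; the original eos_token_id=1 is appropriate.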
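
The effect of the stop id can be illustrated without loading the model. The following is a minimal sketch, assuming (as this reply and the PR description imply, but do not state outright) that id 1 is <|plamo:bos|> and id 2 is <|plamo:eos|>. The token stream and the helper `truncate_at_stop` are hypothetical and only mimic how a decoder stops on a configured eos_token_id.

```python
from typing import Iterable, List, Set

BOS_ID = 1  # assumed to be <|plamo:bos|>, the sample separator used in pre-training
EOS_ID = 2  # assumed to be <|plamo:eos|>, in the vocab but unused in pre-training


def truncate_at_stop(token_ids: Iterable[int], stop_ids: Set[int]) -> List[int]:
    """Return the tokens generated before the first stop id (stop id excluded)."""
    kept: List[int] = []
    for token_id in token_ids:
        if token_id in stop_ids:
            break
        kept.append(token_id)
    return kept


# Hypothetical continuation from a base model: it ends the current sample with
# BOS (the only boundary marker it saw in training) and then starts a new one.
generated = [501, 502, 503, BOS_ID, 777, 778, 779]

print(truncate_at_stop(generated, {BOS_ID}))  # stops at the sample boundary
print(truncate_at_stop(generated, {EOS_ID}))  # never stops: EOS is not emitted
```

With the stop set containing only id 2, the whole stream is kept, which matches the "generates forever" symptom when a model never emits that id.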

As for the 16 in the eos_token_id list, this is <|plamo:tag|>. With the subsequent post-training in mind, the pre-training data for plamo-3-nict-8b-base also included data in a chat-like format. In that data, <|plamo:tag|> was used as the delimiter between messages.
Calling tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False) gives you the concrete string:

<|plamo:bos|><|plamo:tag|>user<|plamo:msg|>This is a user message<|plamo:tag|>assistant<|plamo:msg|>This is an assistant message<|plamo:tag|>

Note in particular that no newline is inserted between messages.
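
The format above can be reproduced by hand, for example to check a data pipeline against it. This is a minimal sketch based only on the example string in this reply, not on the actual chat_template.jinja. `render_plamo_chat` is a hypothetical helper, and the authoritative output is whatever tokenizer.apply_chat_template returns.

```python
from typing import Dict, List

BOS = "<|plamo:bos|>"
TAG = "<|plamo:tag|>"  # message delimiter (id 16 per the reply)
MSG = "<|plamo:msg|>"  # separates the role name from the message body


def render_plamo_chat(messages: List[Dict[str, str]]) -> str:
    """Concatenate messages in the format shown above, with no newlines added."""
    parts = [BOS]
    for message in messages:
        parts.append(f"{TAG}{message['role']}{MSG}{message['content']}")
    parts.append(TAG)  # trailing delimiter closes the last message
    return "".join(parts)


messages = [
    {"role": "user", "content": "This is a user message"},
    {"role": "assistant", "content": "This is an assistant message"},
]
rendered = render_plamo_chat(messages)
print(rendered)
```

Comparing this string with the tokenizer's own output for the same messages is a quick way to catch a stray newline or a missing trailing <|plamo:tag|>.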

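Given the above, the change in #2 is not appropriate either, and we believe the original chat_template.jinja should be kept as is.
Of course, as you have already done with plamo-3-nict-8b-magpie-lora, it is also possible to define a new chat format and run instruction tuning with it. However, the chat format defined during pre-training is exactly what is in chat_template.jinja, so training will most likely go more smoothly if you use that as the basis.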

alfredplpl

Dec 16, 2025

Thank you. Your detailed answer made this clear. I will think about the template contents when fine-tuning your model.
Thank you again. I will close this now.

alfredplpl changed pull request status to closed Dec 16, 2025
