Qwen3-VL WebAgent BrowserOS

This model is a Qwen3-VL-4B fine-tuned with reinforcement learning for web agent tasks in BrowserOS environments.

Model Details

Base Model: Qwen/Qwen3-VL-4B
Architecture: Qwen3VLForConditionalGeneration
Training Method: Reinforcement Learning (trajectory-based RL)
Precision: bfloat16
Parameters: ~4.4B
Transformers Version: 4.57.3

Usage

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "DavidBShan/Clado-BrowserOS-Action",
    torch_dtype="bfloat16",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "DavidBShan/Clado-BrowserOS-Action"
)

Training

This model was trained using trajectory-based reinforcement learning on web navigation tasks in a BrowserOS environment. The RL training was performed on top of a Qwen3-VL-4B base model.

Downloads last month: -

Safetensors

Model size

4B params

Tensor type

BF16