Qwen3-VL WebAgent BrowserOS

This model is a Qwen3-VL-4B fine-tuned with reinforcement learning for web agent tasks in BrowserOS environments.

Model Details

  • Base Model: Qwen/Qwen3-VL-4B
  • Architecture: Qwen3VLForConditionalGeneration
  • Training Method: Reinforcement Learning (trajectory-based RL)
  • Precision: bfloat16
  • Parameters: ~4.4B
  • Transformers Version: 4.57.3

Usage

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

model = Qwen3VLForConditionalGeneration.from_pretrained(
    "DavidBShan/Clado-BrowserOS-Action",
    torch_dtype="bfloat16",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "DavidBShan/Clado-BrowserOS-Action"
)

Training

This model was trained using trajectory-based reinforcement learning on web navigation tasks in a BrowserOS environment. The RL training was performed on top of a Qwen3-VL-4B base model.

Downloads last month
-
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support