Qwen3-VL WebAgent BrowserOS
This model is a Qwen3-VL-4B fine-tuned with reinforcement learning for web agent tasks in BrowserOS environments.
Model Details
- Base Model: Qwen/Qwen3-VL-4B
- Architecture:
Qwen3VLForConditionalGeneration - Training Method: Reinforcement Learning (trajectory-based RL)
- Precision: bfloat16
- Parameters: ~4.4B
- Transformers Version: 4.57.3
Usage
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
model = Qwen3VLForConditionalGeneration.from_pretrained(
"DavidBShan/Clado-BrowserOS-Action",
torch_dtype="bfloat16",
device_map="auto",
)
processor = AutoProcessor.from_pretrained(
"DavidBShan/Clado-BrowserOS-Action"
)
Training
This model was trained using trajectory-based reinforcement learning on web navigation tasks in a BrowserOS environment. The RL training was performed on top of a Qwen3-VL-4B base model.
- Downloads last month
- -