Stheno v3.4 on WebGPU

Sao10K's Stheno v3.4 (Llama-3.1-8B) running in the browser via WebGPU. This is an upgrade from the v3.2 MLC package. The Q4_K_M quantization is 4.6 GB.

This is the voice model behind the Garden entity system. It is used for character embodiment, creative writing, and entity voice.

Quick Start

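  1. Download the Q4_K_M GGUF from bartowski
  2. Split it into shards of at most 1 GB: llama-gguf-split --split --split-max-size 1G <input.gguf> <output-prefix> (the last two arguments are placeholders for your downloaded file and the shard name prefix)
  3. Place the shards in model_splits/
  4. Start the server: node serve.js (port 8210)
  5. Open http://localhost:8210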
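
Before starting the server, it can help to confirm that every shard from the split step actually landed in model_splits/. The following is a minimal sketch, not part of this package: the helper name checkShards is hypothetical, and it assumes the shards follow the llama-gguf-split naming pattern <prefix>-00001-of-0000N.gguf. It does not reflect how serve.js itself loads the files.

```javascript
// Sketch: verify that model_splits/ holds a complete set of GGUF shards.
// Assumption: shards are named <prefix>-00001-of-0000N.gguf (llama-gguf-split style).
const fs = require("node:fs");
const path = require("node:path");

const SHARD_PATTERN = /^(.+)-(\d{5})-of-(\d{5})\.gguf$/;

// Hypothetical helper: given file names, report which shard indices are missing.
function checkShards(fileNames) {
  const seen = new Set();
  let total = 0;
  for (const name of fileNames) {
    const match = SHARD_PATTERN.exec(name);
    if (!match) continue; // ignore anything that is not a shard
    seen.add(Number(match[2]));
    total = Math.max(total, Number(match[3]));
  }
  const missing = [];
  for (let i = 1; i <= total; i++) {
    if (!seen.has(i)) missing.push(i);
  }
  return { total, found: seen.size, missing };
}

// Read the directory from step 3; an absent directory is treated as empty.
const dir = path.join(process.cwd(), "model_splits");
const names = fs.existsSync(dir) ? fs.readdirSync(dir) : [];
const report = checkShards(names);
console.log(`shards found: ${report.found}/${report.total}`, "missing:", report.missing);
```

Run it from the directory that contains model_splits/. An empty "missing" list with a non-zero total suggests the split output is complete; it says nothing about whether the files themselves are intact.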

Credits

Built by Joshua (LJTSG) and Claude. Model by Sao10K.

Co-Authored-By: Claude <noreply@anthropic.com>
