SetFit ModernBERT-base WAMP Router (Optimized ONNX)

This is a specialized SetFit model based on ModernBERT-base, exported to ONNX and optimized for high-performance LLM context pruning.

Key Features

8192 Token Window: Native support for extremely long messages without sliding window overhead.
Memory Optimized: Only the attention of the last 2 layers (20 and 21) is exported, which avoids "Bad Allocation" errors on long contexts.
3-Class Intent Classification: Routes each task into Summary, Needle, or Reasoning, with 100% reported accuracy.
Dual Precision: Includes both FP32 (model.onnx) and INT8 Quantized (model_quantized.onnx) versions.
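
Choosing between the two precision variants can be sketched as below. The file names come from the feature list above; the helper name resolve_model_path and the "prefer INT8, fall back to FP32" policy are illustrative assumptions, not part of WAMP-proxy.

```python
from pathlib import Path

# File names as listed under Key Features.
FP32_NAME = "model.onnx"
INT8_NAME = "model_quantized.onnx"


def resolve_model_path(model_dir, prefer_quantized=True):
    """Return the ONNX file to load from model_dir.

    Hypothetical helper: tries the preferred precision first and
    falls back to the other variant if the file is missing.
    """
    base = Path(model_dir)
    if prefer_quantized:
        order = [INT8_NAME, FP32_NAME]
    else:
        order = [FP32_NAME, INT8_NAME]
    for name in order:
        candidate = base / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no ONNX model found in {base}")
```

The returned path can then be handed to whichever ONNX runtime you use for inference.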

Classification Map

Label 0: Summary (Chatter, Recaps, TL;DR)
Label 1: Needle (Pinpoint facts, parameters, keys, IPs)
Label 2: Reasoning (Comparison, analysis, logical chains)
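
As a small illustration of the map above, the following sketch turns a vector of per-class scores into a class name. The label ids and names are taken from the map; the function name route and the assumption that scores are ordered by label id are hypothetical.

```python
# Label ids and names as given in the Classification Map.
LABELS = {0: "Summary", 1: "Needle", 2: "Reasoning"}


def route(scores):
    """Return the class name with the highest score.

    Assumes scores[i] is the score for label id i.
    """
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]
```

For example, route([0.1, 0.7, 0.2]) selects label 1 and returns "Needle".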

Project Origin

This model is the primary engine for the WAMP-proxy project.

Usage in WAMP-proxy

Update your .env file:

FILTER_MODEL_DIR=./model_modernbert_onnx
FILTER_MAX_TOKENS=2048 # Can be up to 8192
FILTER_NEEDLE_ALGO=cls_max
FILTER_REASONING_ALGO=max_max
FILTER_SUMMARY_ALGO=max_max
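
To sanity-check these settings before starting the proxy, a minimal sketch along the following lines may help. It is not how WAMP-proxy itself reads its configuration; parse_env and load_filter_config are hypothetical helpers, and the 8192 limit is the token window stated under Key Features.

```python
MODEL_MAX_TOKENS = 8192  # token window stated under Key Features


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop inline comments
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def load_filter_config(text):
    """Validate the FILTER_* settings and group the per-class algorithms."""
    env = parse_env(text)
    max_tokens = int(env.get("FILTER_MAX_TOKENS", "2048"))
    if not 1 <= max_tokens <= MODEL_MAX_TOKENS:
        raise ValueError(f"FILTER_MAX_TOKENS must be in 1..{MODEL_MAX_TOKENS}")
    return {
        "model_dir": env.get("FILTER_MODEL_DIR"),
        "max_tokens": max_tokens,
        "algos": {
            "Summary": env.get("FILTER_SUMMARY_ALGO"),
            "Needle": env.get("FILTER_NEEDLE_ALGO"),
            "Reasoning": env.get("FILTER_REASONING_ALGO"),
        },
    }
```

Passing the contents of the .env file to load_filter_config yields a dictionary with the model directory, the token limit, and one algorithm name per class; a FILTER_MAX_TOKENS value above 8192 raises ValueError.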

License

MIT

Model tree for naranor/SetFit-ModernBERT-WAMP-V1

Base model: answerdotai/ModernBERT-base
This model: quantized derivative