Text Classification · setfit · ONNX · modernbert · attention-weights · context-compression · intent-classification · multilingual
How to use naranor/SetFit-ModernBERT-WAMP-V1 with the setfit library:

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("naranor/SetFit-ModernBERT-WAMP-V1")
```
SetFit ModernBERT-base WAMP Router (Optimized ONNX)
This is a specialized SetFit model based on ModernBERT-base, exported to ONNX and optimized for high-performance LLM context pruning.
Key Features
- 8192 Token Window: Native support for extremely long messages without sliding window overhead.
- Memory Optimized: only the attention weights of the last two layers (20 and 21) are exported, preventing "Bad Allocation" errors on long contexts.
- 3-Class Intent Classification: 100% accuracy on routing tasks into the Summary, Needle, and Reasoning classes.
- Dual Precision: includes both FP32 (`model.onnx`) and INT8 quantized (`model_quantized.onnx`) exports.
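Because both precisions ship in the repo, downstream code can pick the export at load time. A minimal sketch, assuming the repo files have been downloaded into a local directory (the `onnx_export_path` helper is hypothetical; only the two filenames come from this card):

```python
import os

def onnx_export_path(model_dir: str, quantized: bool = True) -> str:
    """Return the path of the INT8 or FP32 export (filenames from this card)."""
    filename = "model_quantized.onnx" if quantized else "model.onnx"
    return os.path.join(model_dir, filename)

# Illustrative usage with ONNX Runtime:
# session = onnxruntime.InferenceSession(onnx_export_path("./model_modernbert_onnx"))
```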
Classification Map
- Label 0: Summary (Chatter, Recaps, TL;DR)
- Label 1: Needle (Pinpoint facts, parameters, keys, IPs)
- Label 2: Reasoning (Comparison, analysis, logical chains)
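For illustration, the label map above can be expressed as a small lookup; `route_intent` is a hypothetical helper, not part of the model's API:

```python
# Label ids as documented in the classification map above.
ID2LABEL = {
    0: "Summary",    # chatter, recaps, TL;DR
    1: "Needle",     # pinpoint facts, parameters, keys, IPs
    2: "Reasoning",  # comparison, analysis, logical chains
}

def route_intent(class_id: int) -> str:
    """Map a raw classifier output to its intent label (hypothetical helper)."""
    try:
        return ID2LABEL[class_id]
    except KeyError:
        raise ValueError(f"unexpected class id: {class_id}")
```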
Project Origin
This model is the primary engine for the WAMP-proxy project.
Usage in WAMP-proxy
Update your `.env` file:

```
FILTER_MODEL_DIR=./model_modernbert_onnx
FILTER_MAX_TOKENS=2048        # can be raised up to 8192
FILTER_NEEDLE_ALGO=cls_max
FILTER_REASONING_ALGO=max_max
FILTER_SUMMARY_ALGO=max_max
```
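The per-class algorithm settings above can be read back at runtime. A minimal sketch under stated assumptions: the `FILTER_*` names and defaults come from the `.env` example, while the `config` dict layout is purely illustrative, not WAMP-proxy's actual internals:

```python
import os

# Defaults mirror the .env example; only the FILTER_* variable names
# are documented, the dict structure here is illustrative.
config = {
    "model_dir": os.getenv("FILTER_MODEL_DIR", "./model_modernbert_onnx"),
    "max_tokens": int(os.getenv("FILTER_MAX_TOKENS", "2048")),  # up to 8192
    "algos": {
        "Needle": os.getenv("FILTER_NEEDLE_ALGO", "cls_max"),
        "Reasoning": os.getenv("FILTER_REASONING_ALGO", "max_max"),
        "Summary": os.getenv("FILTER_SUMMARY_ALGO", "max_max"),
    },
}
```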
License
MIT
Model tree for naranor/SetFit-ModernBERT-WAMP-V1
Base model: answerdotai/ModernBERT-base