MarbleNet VAD -- GGUF
GGUF conversion of nvidia/Frame_VAD_Multilingual_MarbleNet_v2.0 for use with CrispStrobe/CrispASR.
Available variants
| File | Size | Notes |
|---|---|---|
| marblenet-vad.gguf | 439 KB | F32, BatchNorm fused into conv weights |

No quantization needed: the model has only 91.5K parameters (439 KB).
Model details
- Architecture: MarbleNet, a 1D time-channel separable CNN (6 Jasper blocks: depthwise conv + pointwise conv + BN + ReLU)
- Parameters: 91.5K (smallest VAD model in CrispASR)
- Languages: Chinese, English, French, German, Russian, Spanish
- Input: 80-bin mel spectrogram (16 kHz, 512-point FFT, 25 ms window, 10 ms stride)
- Output: per-frame speech probability (20 ms per frame)
- Training data: 2,600h real + 1,000h synthetic + 330h noise
- License: NVIDIA Open Model License (commercial use OK)
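The front-end figures above translate directly into sample counts, and the 20 ms output frames can be grouped into speech segments. The following is a minimal Python sketch of that arithmetic, not CrispASR's implementation: the `probs_to_segments` helper, the 0.5 threshold, and the example probabilities are all assumptions made for illustration.

```python
# Front-end parameters taken from the model details above.
SAMPLE_RATE = 16_000     # Hz
WINDOW_MS = 25           # analysis window length
STRIDE_MS = 10           # hop between mel frames
OUTPUT_FRAME_MS = 20     # one speech probability per 20 ms

window_samples = SAMPLE_RATE * WINDOW_MS // 1000   # 400 samples per window
stride_samples = SAMPLE_RATE * STRIDE_MS // 1000   # 160 samples per hop


def probs_to_segments(probs, threshold=0.5, frame_ms=OUTPUT_FRAME_MS):
    """Group consecutive frames with prob >= threshold into (start_s, end_s).

    Hypothetical helper: the threshold value is an assumption, not a
    setting documented for this model.
    """
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # speech begins at this frame
        elif p < threshold and start is not None:
            segments.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None                   # speech ended before this frame
    if start is not None:                  # audio ended while still in speech
        segments.append((start * frame_ms / 1000, len(probs) * frame_ms / 1000))
    return segments


# Made-up probabilities for nine output frames (180 ms of audio).
example = [0.1, 0.2, 0.9, 0.95, 0.8, 0.3, 0.1, 0.7, 0.6]
print(probs_to_segments(example))   # [(0.04, 0.1), (0.14, 0.18)]
```

A production pipeline would usually add padding and minimum-duration rules on top of plain thresholding; those are omitted here to keep the frame-to-time mapping visible.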
Benchmark (ROC-AUC from NVIDIA)
| Dataset | ROC-AUC |
|---|---|
| VoxConverse-test | 96.65 |
| VoxConverse-dev | 97.59 |
| AMI-test | 96.25 |
| Earnings21 | 97.11 |
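For readers unfamiliar with the metric, ROC-AUC is the probability that a randomly chosen speech frame receives a higher score than a randomly chosen non-speech frame. The table values appear to be expressed as percentages. The sketch below computes the metric on a tiny made-up example in plain Python; the `roc_auc` function and the data are illustrative assumptions, not NVIDIA's evaluation code.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison; tied scores count as half a win.

    labels: 1 for speech frames, 0 for non-speech frames.
    scores: per-frame speech probabilities.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for six frames.
labels = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
auc = roc_auc(labels, scores)
print(f"ROC-AUC: {100 * auc:.2f}")   # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of frames, which is fine for illustration; a rank-based formulation is the usual choice for full datasets.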
Usage with CrispASR
# Auto-download (439 KB)
crispasr --backend whisper -m auto --auto-download --vad -vm marblenet -f audio.wav
# Or with explicit path
crispasr --backend parakeet -m auto --auto-download --vad -vm marblenet-vad.gguf -f audio.wav
CrispASR VAD comparison
| VAD Model | Size | Latency | Languages |
|---|---|---|---|
| Silero VAD v5 | 0.9 MB | ~60 ms | Multilingual |
| MarbleNet | 0.4 MB | ~30 ms | 6 languages |
| FireRedVAD | 2.4 MB | ~50 ms | 100+ (recommended) |
| Whisper-VAD-ASMR | 22 MB | ~1000 ms | Experimental |
Conversion
python models/convert-marblenet-vad-to-gguf.py \
--input nvidia/Frame_VAD_Multilingual_MarbleNet_v2.0 \
--output marblenet-vad.gguf
BatchNorm layers are fused into the convolution weights at conversion time, which reduces the tensor count from 84 in the original checkpoint to 36.
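BatchNorm fusion works because, at inference time, BatchNorm is a fixed per-channel affine transform that can be folded into the preceding convolution: with `scale = gamma / sqrt(var + eps)`, the fused weight is `W * scale` and the fused bias is `(b - mean) * scale + beta`. The following plain-Python sketch checks that identity on a toy 1D convolution. It illustrates the general technique only; the function names, tensor layout, and `eps` value are assumptions, not the contents of the conversion script.

```python
import math


def fuse_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-output-channel BatchNorm statistics into conv weight and bias.

    weight layout (assumed for this sketch): [out_ch][in_ch][kernel].
    """
    fused_w, fused_b = [], []
    for oc, kernels in enumerate(weight):
        scale = gamma[oc] / math.sqrt(var[oc] + eps)
        fused_w.append([[w * scale for w in kern] for kern in kernels])
        fused_b.append((bias[oc] - mean[oc]) * scale + beta[oc])
    return fused_w, fused_b


def conv1d(x, weight, bias):
    """Valid (unpadded) 1D cross-correlation; x layout: [in_ch][time]."""
    k = len(weight[0][0])
    t_out = len(x[0]) - k + 1
    out = []
    for oc, kernels in enumerate(weight):
        row = []
        for t in range(t_out):
            acc = bias[oc]
            for ic, kern in enumerate(kernels):
                for j in range(k):
                    acc += kern[j] * x[ic][t + j]
            row.append(acc)
        out.append(row)
    return out


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm applied per channel."""
    return [
        [(v - mean[c]) / math.sqrt(var[c] + eps) * gamma[c] + beta[c] for v in row]
        for c, row in enumerate(y)
    ]


# Toy example: 2 input channels, 2 output channels, kernel size 3.
x = [[1.0, 2.0, 3.0, 4.0, 5.0], [0.5, -1.0, 0.0, 2.0, 1.5]]
weight = [[[0.2, -0.1, 0.4], [0.3, 0.0, -0.2]],
          [[-0.5, 0.1, 0.2], [0.05, 0.6, -0.3]]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [2.0, 0.5]

reference = batchnorm(conv1d(x, weight, bias), gamma, beta, mean, var)
fused_w, fused_b = fuse_bn_into_conv(weight, bias, gamma, beta, mean, var)
fused = conv1d(x, fused_w, fused_b)

max_diff = max(abs(a - b) for ra, rb in zip(reference, fused) for a, b in zip(ra, rb))
print(max_diff < 1e-9)   # True: both paths agree up to rounding
```

After fusion the BatchNorm scale, shift, mean, and variance tensors no longer need to be stored, which is consistent with the lower tensor count reported above.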