North Code Quant
GGUF Code Generation
High-performance quantized GGUF builds of Cohere's North Code model.
Optimized for local inference via llama.cpp, LM Studio, and Ollama.
Base Model: Cohere North Code
Architecture: Cohere / Command-R
Context Length: 128K tokens
License: CC-BY-NC / Custom
⚡ Quick Start
LM Studio
Search for "North Code Quant" in the LM Studio search bar, select your preferred quantization level from the sidebar, and click Download.
llama.cpp
Bash
./llama-cli -m north-code-quant-Q4_K_M.gguf \
  --ctx-size 8192 \
  --threads $(nproc) \
  --prompt "def fibonacci(n):"
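Ollama
The card lists Ollama as a supported runtime but gives no steps for it. The following is a minimal sketch, assuming the GGUF file sits in the current directory under the same name used in the llama.cpp example; the model tag `north-code-quant` and the `num_ctx` value are illustrative choices, not settings published with this release.

```shell
# Write a minimal Modelfile pointing at the local GGUF file.
# The file name and the tag "north-code-quant" are assumptions.
cat > Modelfile <<'EOF'
FROM ./north-code-quant-Q4_K_M.gguf
PARAMETER num_ctx 8192
EOF

# Register and run the model only if Ollama and the GGUF file are present.
if command -v ollama >/dev/null 2>&1 && [ -f north-code-quant-Q4_K_M.gguf ]; then
  ollama create north-code-quant -f Modelfile
  ollama run north-code-quant "def fibonacci(n):"
else
  echo "Skipping: ollama or the GGUF file was not found." >&2
fi
```

The guard keeps the script harmless on machines where Ollama is not installed; remove it once the setup is known to be in place.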
📦 Available Quants
Files are sorted by size and quality. Q4_K_M is recommended for most users as the best balance of inference speed and output quality (as measured by perplexity).
| File Name | Quant Type | Size | Description |
|---|---|---|---|
| North-Code-Quant.gguf | Q8_0 | -- GB | Near-lossless. Best quality, higher VRAM/RAM requirement. |
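Since these files are large, an interrupted download is easy to miss. As a quick sanity check, a GGUF file begins with the four ASCII bytes `GGUF`, so a small helper can confirm that before the file is loaded. The helper name `is_gguf` is hypothetical, and this is only a sketch: it checks the header, not the integrity of the whole file.

```shell
# Return success if the file begins with the 4-byte ASCII magic "GGUF".
is_gguf() {
  [ -f "$1" ] && [ "$(head -c 4 "$1")" = "GGUF" ]
}

# Example usage with the file name from the table above.
file="North-Code-Quant.gguf"
if is_gguf "$file"; then
  echo "$file looks like a GGUF file"
else
  echo "$file is missing or not a GGUF file" >&2
fi
```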
About This Quantization
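These GGUF files were converted from the official Cohere North Code weights using llama.cpp. Quantization used importance matrix (imatrix) calibration, which weights the rounding error by how strongly each tensor value influences the model's output on a calibration text, so that lower-bit quants lose less accuracy.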
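The card does not give the exact commands used for this release. One plausible pipeline with llama.cpp's standard tools (`convert_hf_to_gguf.py`, `llama-imatrix`, `llama-quantize`) is sketched below. Every path, the calibration file, and the helper names `run` and `quantize_with_imatrix` are placeholders, and whether the converter supports this architecture in a given llama.cpp version is an assumption. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# All paths below are placeholders, not the files used for this release.
quantize_with_imatrix() {
  src_dir="$1"   # directory with the original Hugging Face weights
  calib="$2"     # plain-text calibration corpus
  qtype="$3"     # target quant type, e.g. Q4_K_M

  # 1. Convert the original weights to a 16-bit GGUF.
  run python convert_hf_to_gguf.py "$src_dir" --outfile model-f16.gguf --outtype f16
  # 2. Collect importance-matrix statistics on the calibration text.
  run ./llama-imatrix -m model-f16.gguf -f "$calib" -o imatrix.dat
  # 3. Quantize, using the importance matrix to guide rounding.
  run ./llama-quantize --imatrix imatrix.dat model-f16.gguf "model-$qtype.gguf" "$qtype"
}

quantize_with_imatrix ./north-code-hf calibration.txt Q4_K_M
```

Running with the default dry-run setting is a cheap way to review the three steps before committing to a conversion, which can take a long time for large models.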
⚠️ Disclaimer: This is a quantized derivative model. While quants retain most of the base model's capabilities, lower-bit quantizations may exhibit degraded performance in edge-case code generation or multilingual tasks. Always verify generated code before execution. This model inherits the license terms of the original Cohere North Code model.