Cecilia 2B Instruct v1 - GGUF

This repository contains quantized GGUF versions of the gia-uh/cecilia-2b-instruct-v1 model (llama architecture, 2B parameters).

These files are optimized for efficient local inference on CPUs and GPUs using tools like llama.cpp, Ollama, LM Studio, GPT4All, and others.
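As a minimal sketch of local inference with the llama-cpp-python bindings (`pip install llama-cpp-python`), assuming the Q4_K_M file from the table below has already been downloaded to the current directory; the context size and thread count here are illustrative defaults, not values confirmed for this model:

```python
from llama_cpp import Llama

# Load a quantized GGUF file from disk.
llm = Llama(
    model_path="cecilia-2b-instruct-v1-Q4_K_M.gguf",
    n_ctx=2048,    # assumed context window; adjust to what the model supports
    n_threads=4,   # CPU threads; tune for your machine
)

# Run a single chat turn using the library's chat-completion interface.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hola, ¿quién eres?"}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```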

📦 Available Files

Below are the available quantization formats. Because this is a small 2B-parameter model, quantization losses are proportionally more noticeable, so it is recommended to use the highest precision your hardware allows in order to maintain coherence.

| Filename | Type | Size (approx.) | Description & Recommended Use |
|---|---|---|---|
| cecilia-2b-instruct-v1-Q8_0.gguf | Q8_0 | 2.24 GB | Max fidelity. Almost indistinguishable from the original model. |
| cecilia-2b-instruct-v1-Q6_K.gguf | Q6_K | 1.79 GB | Balanced. High quality with considerable memory savings. |
| cecilia-2b-instruct-v1-Q4_K_M.gguf | Q4_K_M | 1.4 GB | High speed. Best compression, but with some quality loss. Best for older laptops or mobile devices. |
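To fetch one of these files programmatically, a minimal sketch using `huggingface_hub` (the repo id is taken from this page; the filename must match one of the entries in the table above):

```python
from huggingface_hub import hf_hub_download

# Download the Q6_K variant from this repository; returns the local file path.
path = hf_hub_download(
    repo_id="gia-uh/cecilia-2b-instruct-v1-GGUF",
    filename="cecilia-2b-instruct-v1-Q6_K.gguf",
)
print(f"Model saved to: {path}")
```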
