EUAIAct-Qwen2.5-0.5B-Edge
A compact, offline-first Language Model specialized in EU AI Act & GDPR
Runs on mobile browsers · laptops · edge servers — no cloud, no API, no data transfer
Overview
EUAIAct-Qwen2.5-0.5B-Edge is a 494M-parameter small language model fine-tuned on a corpus covering the EU AI Act (Regulation (EU) 2024/1689) and GDPR compliance.
It is designed to run entirely on-device: in a mobile browser via WebGPU, on Apple Silicon via MLX, or on any machine via GGUF. No internet connection is required after download.
Advisory use only — not a substitute for qualified legal counsel.
✦ Key Features
- Offline-first — works without internet once downloaded
- Cross-platform — Windows · macOS · Linux · Mobile · Browser
- Mobile-ready — runs in browser via Transformers.js + WebGPU
- Multilingual — French 🇫🇷 and English 🇬🇧
- Lightweight: from ~350 MB (GGUF Q4_K_M) to ~1 GB (MLX BF16)
- Open weights — Apache 2.0, fully auditable
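The offline-first claim can be exercised from Python once the files are in the local Hugging Face cache. The sketch below is a minimal illustration, assuming the `huggingface_hub` package is installed and the repository has already been downloaded once; the helper names `enable_offline_mode` and `cached_model_path` are hypothetical and not part of this model card.

```python
import os

REPO_ID = "sallani/EUAIAct-Qwen2.5-0.5B-Edge"


def enable_offline_mode(env=os.environ):
    """Ask huggingface_hub to skip all network calls.

    HF_HUB_OFFLINE is read when huggingface_hub is first imported,
    so call this before importing any Hugging Face library.
    """
    env["HF_HUB_OFFLINE"] = "1"
    return env


def cached_model_path(repo_id=REPO_ID):
    """Return the local snapshot directory without touching the network.

    Raises an error if the repository was never downloaded.
    """
    # Lazy import so the environment variable above is set first.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id, local_files_only=True)
```

Calling `enable_offline_mode()` followed by `cached_model_path()` on a machine with no connectivity is one way to confirm that nothing is fetched at inference time.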
Deployment formats
| Format | Size | Runtime | Best for |
|---|---|---|---|
| ONNX INT8 | ~400 MB | Transformers.js | Browser · Mobile · WebGPU — Windows · macOS · Linux |
| GGUF Q4_K_M | ~350 MB | llama.cpp | Windows · macOS · Linux · Edge servers |
| MLX BF16 | ~1.0 GB | mlx_lm | Apple Silicon only (M1/M2/M3/M4) |
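The table above can be read as a simple decision rule. The following sketch encodes it in Python; `recommend_format` is a hypothetical helper, and the rule (browser first, then Apple Silicon, then GGUF as the general fallback) is an assumption drawn from the "Best for" column rather than an official selection policy.

```python
import platform


def recommend_format(system, machine, in_browser=False):
    """Map a platform description to one of the three published formats."""
    if in_browser:
        # Transformers.js consumes the ONNX INT8 export.
        return "ONNX INT8"
    if system == "Darwin" and machine == "arm64":
        # macOS on Apple Silicon (M1/M2/M3/M4) can use MLX.
        return "MLX BF16"
    # llama.cpp runs on Windows, macOS, Linux and edge servers.
    return "GGUF Q4_K_M"


def recommend_for_this_machine():
    """Apply the rule to the machine running this script."""
    return recommend_format(platform.system(), platform.machine())
```

On Apple Silicon the GGUF build remains a valid choice as well; the rule simply prefers the format listed as specific to that hardware.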
Quickstart
🌐 Browser & Mobile — Transformers.js + WebGPU
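// WebGPU support ships with Transformers.js v3, published as @huggingface/transformers
// (the older @xenova/transformers package does not accept the `device` option).
import { pipeline } from '@huggingface/transformers';

// Load the text-generation pipeline and run it on the GPU through WebGPU.
const ai = await pipeline(
  'text-generation',
  'sallani/EUAIAct-Qwen2.5-0.5B-Edge',
  { device: 'webgpu' }
);

// Generate up to 400 new tokens for a single question.
const response = await ai(
  "What are the obligations for high-risk AI system providers under Article 16?",
  { max_new_tokens: 400 }
);
console.log(response[0].generated_text);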
🍎 Apple Silicon — MLX
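# pip install --upgrade mlx-lm
from mlx_lm import load, generate

# Downloads the weights on first use, then loads them from the local cache.
model, tokenizer = load("sallani/EUAIAct-Qwen2.5-0.5B-Edge")

# The model is bilingual; this example asks in French.
# System: "You are an EU AI Act and GDPR expert."
# User: "What is a high-risk AI system according to Article 6?"
messages = [
    {"role": "system", "content": "Tu es un expert EU AI Act et RGPD."},
    {"role": "user", "content": "Qu'est-ce qu'un système IA à haut risque selon l'Article 6 ?"}
]

# Render the chat template to a string and append the assistant turn marker.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=512))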
🖥️ llama.cpp — Windows · macOS · Linux
# Qwen2.5 uses the ChatML prompt format; llama.cpp names that built-in template "chatml".
llama-cli \
  -m euaiact-qwen2.5-0.5b-q4_k_m.gguf \
  --chat-template chatml \
  -p "Explain the EU AI Act conformity assessment procedure." \
  -n 512
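The same GGUF file can also be driven from Python through llama-cpp-python. This is a minimal sketch, assuming `llama-cpp-python` and `huggingface-hub` are installed; `build_messages` and `ask` are hypothetical helpers, and the English system prompt is an assumed translation of the French one used in the MLX example.

```python
REPO_ID = "sallani/EUAIAct-Qwen2.5-0.5B-Edge"
GGUF_FILE = "euaiact-qwen2.5-0.5b-q4_k_m.gguf"

# Assumed English rendering of the French system prompt from the MLX example.
SYSTEM_PROMPT = "You are an EU AI Act and GDPR expert."


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Return an OpenAI-style chat message list for a single question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question.strip()},
    ]


def ask(question, max_tokens=512):
    """Fetch the GGUF on first use, then answer locally."""
    # Lazy import: pip install llama-cpp-python huggingface-hub
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=GGUF_FILE,
        n_ctx=1024,      # context length listed in the model details
        verbose=False,
    )
    result = llm.create_chat_completion(
        messages=build_messages(question),
        max_tokens=max_tokens,
    )
    return result["choices"][0]["message"]["content"]
```

For example, `ask("Explain the EU AI Act conformity assessment procedure.")` mirrors the `llama-cli` command above. The chat template embedded in the GGUF metadata is applied by `create_chat_completion`, so no template flag is needed here.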
Model details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen2.5-0.5B-Instruct (Apache 2.0) |
| Fine-tuning method | LoRA — r=8, α=16, 8 layers |
| Training hardware | Apple M4 Max · MLX |
| Parameters | 494M |
| Training data | 300 curated Q&A pairs — EU AI Act · GDPR |
| Languages | French 🇫🇷 · English 🇬🇧 |
| Context length | 1,024 tokens |
| License | Apache 2.0 |
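The context length matters when choosing generation limits. The sketch below assumes the 1,024-token figure is a hard budget shared by the prompt and the generated tokens; the helper names are hypothetical.

```python
CONTEXT_LENGTH = 1024  # from the model details table


def max_prompt_tokens(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for the prompt after reserving room for the answer."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens


def fits_context(prompt_tokens, max_new_tokens, context_length=CONTEXT_LENGTH):
    """True if prompt plus generated tokens stay within the context window."""
    return prompt_tokens + max_new_tokens <= context_length
```

Under that assumption, the 512-token generation limit used in the Quickstart leaves 512 tokens for the system prompt, the question, and the chat template markers.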
Training corpus
The Q&A pairs cover the two core EU regulations for AI compliance:
| Regulation | Coverage |
|---|---|
| 🏛️ EU AI Act (2024/1689) | Articles 1–113, Annexes I–XIII, risk classification, provider & deployer obligations, GPAI, conformity assessment |
| 🔒 GDPR | AI training data, DPIAs, data subject rights, lawful basis |
Performance
| Format | RAM | Speed (CPU) | Speed (Apple Silicon) |
|---|---|---|---|
| MLX BF16 | ~2 GB | — | ~80 tok/s (M4 Max) |
| ONNX INT8 | ~1.2 GB | ~20 tok/s | ~60 tok/s (WebGPU) |
| GGUF Q4_K_M | ~1 GB | ~25 tok/s | ~90 tok/s |
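The throughput figures translate directly into a rough wait time per answer. This sketch divides the number of generated tokens by the approximate tok/s values from the table; it ignores model load and prompt processing time, and the dictionary keys and function name are hypothetical.

```python
# Approximate decode speeds (tokens per second) copied from the table above.
THROUGHPUT_TOK_PER_S = {
    ("MLX BF16", "apple_silicon"): 80.0,
    ("ONNX INT8", "cpu"): 20.0,
    ("ONNX INT8", "apple_silicon"): 60.0,   # measured via WebGPU
    ("GGUF Q4_K_M", "cpu"): 25.0,
    ("GGUF Q4_K_M", "apple_silicon"): 90.0,
}


def estimated_seconds(fmt, device, new_tokens):
    """Estimate generation time for `new_tokens` at the tabulated speed."""
    return new_tokens / THROUGHPUT_TOK_PER_S[(fmt, device)]
```

For instance, a 512-token answer from the GGUF build on CPU would take roughly 20 seconds at ~25 tok/s, and about 6 seconds with MLX on an M4 Max.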
Intended use
| Use case | |
|---|---|
| EU AI Act compliance Q&A | ✅ |
| GDPR guidance for AI systems | ✅ |
| AI risk classification assistance | ✅ |
| GRC documentation support | ✅ |
| Formal legal advice | ❌ |
| Autonomous compliance decisions | ❌ |
Target users: CISOs, DPOs, GRC consultants, legal counsel, compliance officers.
EU AI Act self-assessment
This model itself falls under limited risk (Article 50 — conversational AI system):
- Disclosure obligation: users must be informed they are interacting with AI
- No autonomous decisions — advisory only
- No personal data processed — on-device inference only
- Below GPAI threshold (< 10²³ training FLOPs)
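The compute claim in the last bullet can be sanity-checked with the common 6·N·D approximation (about six floating-point operations per parameter per training token). The sketch below is illustrative only: the token count is a hypothetical upper bound (300 pairs at the full 1,024-token context, one epoch), the estimate covers the fine-tuning run rather than the pretraining of the base model, and LoRA typically needs less compute than this figure suggests.

```python
GPAI_THRESHOLD_FLOPS = 10**23  # threshold quoted in the list above


def approx_training_flops(n_params, n_tokens, epochs=1):
    """Rough training compute using the 6 * N * D rule of thumb."""
    return 6 * n_params * n_tokens * epochs


def below_threshold(flops, threshold=GPAI_THRESHOLD_FLOPS):
    """True if the estimated compute is under the quoted threshold."""
    return flops < threshold


# Hypothetical upper bound: every Q&A pair fills the whole context window.
N_PARAMS = 494_000_000
N_TOKENS = 300 * 1024
FINE_TUNE_FLOPS = approx_training_flops(N_PARAMS, N_TOKENS)
```

With these assumed inputs the estimate is about 9.1 × 10¹⁴ FLOPs per epoch, many orders of magnitude under 10²³.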
Citation
@misc{allani2026euaiactedge,
author = {Allani, Sabri},
title = {EUAIAct-Qwen2.5-0.5B-Edge: A Sovereign Edge SLM for EU AI Act and GDPR Compliance},
year = {2026},
url = {https://huggingface.co/sallani/EUAIAct-Qwen2.5-0.5B-Edge},
note = {LoRA fine-tune of Qwen2.5-0.5B-Instruct on EU AI Act and GDPR corpus. Apache 2.0.}
}
EU AI Act · GDPR · Edge · Mobile · Offline · Open Weights
Apache 2.0 — Free to use, modify, and distribute