Qwen3-4B (refortif.ai Obfuscated)
We obfuscated an AI model using a novel post-training transformation. Can you figure out how?
The Challenge
Two models are available on Hugging Face: the original Qwen3-4B and this refortif.ai-obfuscated version of the same model. Your goal: figure out the mathematical transformation we applied to the weights.
Key Facts
- The transformation is applied after training. No extra training, fine-tuning, or special training procedure is required.
- The obfuscated model runs on the refortif.ai runtime with minimal performance overhead.
- The complete model never appears in plain form: not at rest, not in transit, and not in VRAM during inference.
- Standard vLLM cannot produce correct output from the obfuscated weights. Try it yourself.
The Models
| Version | Model | Link |
|---|---|---|
| Original | Qwen3-4B | huggingface.co/Qwen/Qwen3-4B |
| Obfuscated | Qwen3-4B (refortif.ai) | huggingface.co/refortifai/Qwen3-4B-obfuscated |
Download both models, compare the weights, and reverse-engineer the transformation.
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-4B |
| Parameters | 4 billion |
| Tensor Type | BF16 |
| Format | Safetensors |
| Hidden Size | 2560 |
| Layers | 36 |
| Attention Heads | 32 (8 KV heads, GQA) |
| Head Dimension | 128 |
| Intermediate Size | 9728 |
| License | Apache 2.0 |
The architecture and config are identical to the original Qwen3-4B. Only the weights have been transformed.
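Because the architecture is unchanged, each tensor in the obfuscated checkpoint should have the same shape as its counterpart in the original, so the figures in the table are enough to predict what you will see when you start comparing. The sketch below works out the per-layer projection shapes. It assumes the parameter names and the `(out_features, in_features)` weight layout used by the Qwen3 implementation in Transformers; neither is stated in this card.

```python
# Values copied from the Model Details table.
HIDDEN_SIZE = 2560
NUM_LAYERS = 36
NUM_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
INTERMEDIATE_SIZE = 9728


def layer_shapes():
    """Expected projection weight shapes for one decoder layer.

    Assumption: names and (out_features, in_features) layout follow the
    Qwen3 implementation in Transformers.
    """
    q_out = NUM_HEADS * HEAD_DIM      # 32 * 128 = 4096, wider than the hidden size
    kv_out = NUM_KV_HEADS * HEAD_DIM  # 8 * 128 = 1024 (grouped-query attention)
    return {
        "self_attn.q_proj.weight": (q_out, HIDDEN_SIZE),
        "self_attn.k_proj.weight": (kv_out, HIDDEN_SIZE),
        "self_attn.v_proj.weight": (kv_out, HIDDEN_SIZE),
        "self_attn.o_proj.weight": (HIDDEN_SIZE, q_out),
        "mlp.gate_proj.weight": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
        "mlp.up_proj.weight": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
        "mlp.down_proj.weight": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    }


def projection_params():
    """Total parameter count of the linear projections across all layers."""
    per_layer = sum(rows * cols for rows, cols in layer_shapes().values())
    return per_layer * NUM_LAYERS


for name, shape in layer_shapes().items():
    print(f"model.layers.<i>.{name}: {shape}")
print(f"projection parameters: {projection_params():,}")
```

The projections account for roughly 3.6 billion of the stated 4 billion parameters; the remainder sits in tensors not covered here, such as embeddings and normalization weights. Note that the query projection is 4096 wide even though the hidden size is 2560.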
How to Download
huggingface-cli download refortifai/Qwen3-4B-obfuscated --local-dir ./Qwen3-4B-obfuscated
To download the original for comparison:
huggingface-cli download Qwen/Qwen3-4B --local-dir ./Qwen3-4B
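If you would rather script the downloads, the same two fetches can be sketched in Python with `snapshot_download` from the `huggingface_hub` package. The `download_all` helper and the file-pattern filter are illustrative choices, not part of this card; the filter assumes that the weight shards and JSON metadata are all you need for a weight comparison.

```python
# Repo IDs and target directories match the two CLI commands above.
REPOS = {
    "Qwen/Qwen3-4B": "./Qwen3-4B",
    "refortifai/Qwen3-4B-obfuscated": "./Qwen3-4B-obfuscated",
}

# Assumption: weights plus JSON metadata are enough for comparing tensors.
PATTERNS = ["*.safetensors", "*.json"]


def download_all(repos=REPOS, fetch=None):
    """Download every repo in `repos` and return {repo_id: local path}.

    `fetch` is injectable so the helper can be exercised without network
    access; by default it is huggingface_hub.snapshot_download.
    """
    if fetch is None:
        # Imported lazily so this module loads without huggingface_hub installed.
        from huggingface_hub import snapshot_download as fetch
    return {
        repo_id: fetch(repo_id=repo_id, local_dir=local_dir, allow_patterns=PATTERNS)
        for repo_id, local_dir in repos.items()
    }
```

Calling `download_all()` with no arguments fetches both checkpoints into the same directories the CLI commands use.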
Compare the Weights
We've open-sourced a visual diff tool to help you get started. It loads both models one tensor at a time (memory-efficient) and gives you per-layer statistics, cosine similarity, histograms, heatmaps, and more.
github.com/refortif-ai/diffstat
git clone https://github.com/refortif-ai/diffstat.git
cd diffstat
pip install -e .
python -m diff_qwen models/Qwen3-4B models/Qwen3-4B-obfuscated
Then open http://localhost:8787 in your browser.
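If you want the raw numbers in a script instead of a browser, a tensor-at-a-time comparison can be sketched as follows. This is not diffstat's implementation, just a minimal illustration of the same idea. It assumes the standard Safetensors layout (an optional `model.safetensors.index.json` with a `weight_map`), that `numpy`, `torch`, and `safetensors` are installed, and that the function names are hypothetical.

```python
import json
from pathlib import Path

import numpy as np


def tensor_stats(a, b):
    """Summary statistics comparing two same-shaped arrays."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    denom = norm_a * norm_b
    return {
        "cosine": float(a @ b / denom) if denom else float("nan"),
        "rel_l2": float(np.linalg.norm(a - b) / (norm_a or 1.0)),
        "max_abs_diff": float(np.abs(a - b).max()),
    }


def weight_map(model_dir):
    """Map tensor name -> shard file name for a Safetensors checkpoint."""
    model_dir = Path(model_dir)
    index = model_dir / "model.safetensors.index.json"
    if index.exists():
        return json.loads(index.read_text())["weight_map"]
    # No index: assume a single-file checkpoint.
    from safetensors import safe_open
    with safe_open(str(model_dir / "model.safetensors"), framework="pt") as f:
        return {name: "model.safetensors" for name in f.keys()}


def compare_models(dir_a, dir_b):
    """Yield (tensor name, stats) pairs, holding one tensor pair in memory at a time."""
    from safetensors import safe_open  # lazy: only needed for real checkpoints
    map_a, map_b = weight_map(dir_a), weight_map(dir_b)
    for name in sorted(map_a.keys() & map_b.keys()):
        path_a = str(Path(dir_a) / map_a[name])
        path_b = str(Path(dir_b) / map_b[name])
        with safe_open(path_a, framework="pt") as fa, safe_open(path_b, framework="pt") as fb:
            # BF16 has no NumPy dtype, so upcast to float32 via PyTorch first.
            ta = fa.get_tensor(name).float().numpy()
            tb = fb.get_tensor(name).float().numpy()
        # Report None when shapes differ, since the stats assume equal shapes.
        yield name, (tensor_stats(ta, tb) if ta.shape == tb.shape else None)
```

A loop such as `for name, stats in compare_models("./Qwen3-4B", "./Qwen3-4B-obfuscated"): print(name, stats)` prints one line per tensor. Reopening the shard for each tensor keeps the code short at the cost of some speed.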
Try It Yourself
Load the obfuscated model in any standard framework and see what happens:
> The meaning of life is
████████████████████ (garbage output)
The obfuscated weights produce completely unusable output in vLLM, Hugging Face Transformers, or any other standard inference engine. On the refortif.ai runtime, the same weights produce correct, coherent output with minimal overhead.
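To see the difference side by side, you can run the same prompt through both checkpoints with stock Transformers. The following is a minimal sketch: the helper names are hypothetical, `torch_dtype="auto"` is an assumption about your installed Transformers version (newer releases also accept `dtype`), and loading each model needs enough memory for 4 billion BF16 parameters.

```python
MODEL_IDS = ["Qwen/Qwen3-4B", "refortifai/Qwen3-4B-obfuscated"]
PROMPT = "The meaning of life is"


def complete(model_id, prompt=PROMPT, max_new_tokens=20):
    """Greedy-decode a short continuation of `prompt` with a stock Transformers model."""
    # Imported lazily because these are heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


def compare_outputs(model_ids=MODEL_IDS, complete_fn=complete):
    """Return {model_id: continuation} for each model, loading them one at a time."""
    return {model_id: complete_fn(model_id) for model_id in model_ids}
```

Calling `compare_outputs()` downloads and loads both models in turn. Going by the claim above, the first continuation should read as ordinary text and the second should not.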
Submit Your Findings
Think you've cracked it? We want to hear from you. Send us your insights, approaches, and analysis. Partial findings are welcome too.
About refortif.ai
Fearless model distribution with zero IP theft risk.
refortif.ai provides post-training model obfuscation that lets you ship your models anywhere without exposing your weights. The obfuscated model is mathematically unusable without the refortif.ai runtime, but runs with near-zero performance overhead when properly deployed.
Want to obfuscate your own models? contact@refortif.ai