Instructions for using refortifai/Qwen3-4B-obfuscated with libraries and local apps. These are the standard loading snippets; as stated under Key Facts, standard frameworks cannot produce correct output from the obfuscated weights.

Libraries

How to use refortifai/Qwen3-4B-obfuscated with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="refortifai/Qwen3-4B-obfuscated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("refortifai/Qwen3-4B-obfuscated")
model = AutoModelForCausalLM.from_pretrained("refortifai/Qwen3-4B-obfuscated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use refortifai/Qwen3-4B-obfuscated with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "refortifai/Qwen3-4B-obfuscated"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "refortifai/Qwen3-4B-obfuscated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
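
The same request can be sent from Python. The sketch below uses only the standard library and assumes a server is already listening on an OpenAI-compatible `/v1/chat/completions` endpoint, as in the curl call above. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM or SGLang.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, prompt, timeout=60):
    """Send the request and return the text of the first choice."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server; port 8000 matches the curl call above):
# print(chat("http://localhost:8000", "refortifai/Qwen3-4B-obfuscated",
#            "What is the capital of France?"))
```

Passing the base URL as an argument lets the same helper target a server on a different port, for example `http://localhost:30000`.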

SGLang

How to use refortifai/Qwen3-4B-obfuscated with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "refortifai/Qwen3-4B-obfuscated" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "refortifai/Qwen3-4B-obfuscated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "refortifai/Qwen3-4B-obfuscated" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "refortifai/Qwen3-4B-obfuscated",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use refortifai/Qwen3-4B-obfuscated with Docker Model Runner:
```
docker model run hf.co/refortifai/Qwen3-4B-obfuscated
```

Qwen3-4B (refortif.ai Obfuscated)

We obfuscated an AI model using a novel post-training transformation. Can you figure out how?

Take the challenge →

The Challenge

Two models are available on Hugging Face: the original Qwen3-4B and this refortif.ai-obfuscated version of the same model. Your goal: figure out the mathematical transform we applied to the weights.

Key Facts

The transformation is applied after training. No extra training, fine-tuning, or special training procedure is required.
The obfuscated model runs on the refortif.ai runtime with minimal performance overhead.
The complete model never appears in plain form: not at rest, not in transit, and not in VRAM during inference.
Standard vLLM cannot produce correct output from the obfuscated weights. Try it yourself.

The Models

Version	Model	Link
Original	Qwen3-4B	huggingface.co/Qwen/Qwen3-4B
Obfuscated	Qwen3-4B (refortif.ai)	huggingface.co/refortifai/Qwen3-4B-obfuscated

Download both models, compare the weights, and reverse-engineer the transformation.
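
A possible first step is to confirm that both checkpoints expose the same tensor names, dtypes, and shapes before looking at values. The sketch below reads only the JSON header of each `.safetensors` file (an 8-byte little-endian length followed by that many bytes of JSON), so it loads no weights. The function names are illustrative, and the directory paths in the usage comment are assumptions about where you saved the models.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_len))


def tensor_index(model_dir):
    """Map tensor name -> (dtype, shape) across every shard in model_dir."""
    index = {}
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        for name, meta in read_safetensors_header(shard).items():
            if name != "__metadata__":  # optional free-form metadata entry
                index[name] = (meta["dtype"], tuple(meta["shape"]))
    return index


def layout_differences(dir_a, dir_b):
    """Names whose dtype or shape differ, or that exist in only one model."""
    a, b = tensor_index(dir_a), tensor_index(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Example (paths are assumptions):
# print(layout_differences("./Qwen3-4B", "./Qwen3-4B-obfuscated"))
```

If the layouts match, any difference between the two checkpoints lies in the tensor values themselves.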

Model Details

Property	Value
Base Model	Qwen/Qwen3-4B
Parameters	4 billion
Tensor Type	BF16
Format	Safetensors
Hidden Size	2560
Layers	36
Attention Heads	32 (8 KV heads, GQA)
Head Dimension	128
Intermediate Size	9728
License	Apache 2.0

The architecture and config are identical to the original Qwen3-4B. Only the weights have been transformed.
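
As a rough guide to what to expect when comparing tensors, the numbers in the table determine the shapes of the linear projections in each layer. The sketch below is simple arithmetic under two assumptions not stated in this card: the projection names follow the usual Qwen-style layout in Transformers (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`), and weights are stored as (out_features, in_features). It ignores embeddings, norms, and the output head.

```python
# Values taken from the Model Details table.
HIDDEN = 2560
LAYERS = 36
Q_HEADS, KV_HEADS, HEAD_DIM = 32, 8, 128
INTERMEDIATE = 9728


def layer_linear_shapes():
    """Assumed (out_features, in_features) shape of each linear weight in one layer."""
    q_out = Q_HEADS * HEAD_DIM    # 32 * 128 = 4096
    kv_out = KV_HEADS * HEAD_DIM  # 8 * 128 = 1024 (grouped-query attention)
    return {
        "q_proj": (q_out, HIDDEN),
        "k_proj": (kv_out, HIDDEN),
        "v_proj": (kv_out, HIDDEN),
        "o_proj": (HIDDEN, q_out),
        "gate_proj": (INTERMEDIATE, HIDDEN),
        "up_proj": (INTERMEDIATE, HIDDEN),
        "down_proj": (HIDDEN, INTERMEDIATE),
    }


def linear_params_per_layer():
    """Number of weights in the linear projections of a single layer."""
    return sum(rows * cols for rows, cols in layer_linear_shapes().values())


per_layer = linear_params_per_layer()
print(per_layer)           # 100925440
print(per_layer * LAYERS)  # 3633315840
```

Under these assumptions the linear projections account for about 3.6 billion of the roughly 4 billion parameters, so they are where most of any weight comparison will be spent.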

How to Download

huggingface-cli download refortifai/Qwen3-4B-obfuscated --local-dir ./Qwen3-4B-obfuscated

To download the original for comparison:

huggingface-cli download Qwen/Qwen3-4B --local-dir ./Qwen3-4B

Compare the Weights

We've open-sourced a visual diff tool to help you get started. It loads both models one tensor at a time (memory-efficient) and gives you per-layer statistics, cosine similarity, histograms, heatmaps, and more.

github.com/refortif-ai/diffstat

git clone https://github.com/refortif-ai/diffstat.git
cd diffstat
pip install -e .
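# Replace the two paths with the directories where you downloaded the models
# (the download commands above used ./Qwen3-4B and ./Qwen3-4B-obfuscated)
python -m diff_qwen models/Qwen3-4B models/Qwen3-4B-obfuscated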

Then open http://localhost:8787 in your browser.
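
To make the cosine similarity statistic concrete, here is a minimal standard-library sketch. It is not the diffstat implementation. It assumes you have already read the raw bytes of a BF16 tensor, or a slice of one, from each checkpoint. It relies on the fact that a BF16 value is the upper 16 bits of an IEEE 754 float32, so decoding is a 16-bit left shift.

```python
import math
import struct


def bf16_bytes_to_floats(raw):
    """Decode little-endian BF16 bytes into a list of Python floats."""
    count = len(raw) // 2
    words = struct.unpack(f"<{count}H", raw[: 2 * count])
    # Shift each 16-bit word into the high half of a float32 bit pattern.
    return [struct.unpack("<f", struct.pack("<I", w << 16))[0] for w in words]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (NaN if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return float("nan")
    return dot / (norm_a * norm_b)


# 0x3F80 is 1.0 and 0xC000 is -2.0 in BF16 (bytes are little-endian).
print(bf16_bytes_to_floats(b"\x80\x3f\x00\xc0"))  # [1.0, -2.0]
```

Pure Python is only practical for small slices; for whole tensors a numerical library such as NumPy or PyTorch is the usual choice.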

Try It Yourself

Load the obfuscated model in any standard framework and see what happens:

> The meaning of life is
  ████████████████████ (garbage output)

The obfuscated weights produce completely unusable output in vLLM, HuggingFace Transformers, or any other standard inference engine. On the refortif.ai runtime, the same weights produce correct, coherent output with minimal overhead.

Submit Your Findings

Think you've cracked it? We want to hear from you. Send us your insights, approaches, and analysis. Partial findings are welcome too.

challenge@refortif.ai

About refortif.ai

Fearless model distribution with zero IP theft risk.

refortif.ai provides post-training model obfuscation that lets you ship your models anywhere without exposing your weights. The obfuscated model is mathematically unusable without the refortif.ai runtime, but runs with near-zero performance overhead when properly deployed.

Want to obfuscate your own models? contact@refortif.ai

Model tree for refortifai/Qwen3-4B-obfuscated

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B
Finetuned (693): this model