Instructions to use AbdoSaad24/deepseek-coder-6.7b-security-dpo with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use AbdoSaad24/deepseek-coder-6.7b-security-dpo with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AbdoSaad24/deepseek-coder-6.7b-security-dpo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AbdoSaad24/deepseek-coder-6.7b-security-dpo")
model = AutoModelForCausalLM.from_pretrained("AbdoSaad24/deepseek-coder-6.7b-security-dpo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use AbdoSaad24/deepseek-coder-6.7b-security-dpo with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AbdoSaad24/deepseek-coder-6.7b-security-dpo"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AbdoSaad24/deepseek-coder-6.7b-security-dpo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/AbdoSaad24/deepseek-coder-6.7b-security-dpo

SGLang

How to use AbdoSaad24/deepseek-coder-6.7b-security-dpo with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AbdoSaad24/deepseek-coder-6.7b-security-dpo" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AbdoSaad24/deepseek-coder-6.7b-security-dpo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AbdoSaad24/deepseek-coder-6.7b-security-dpo" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AbdoSaad24/deepseek-coder-6.7b-security-dpo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use AbdoSaad24/deepseek-coder-6.7b-security-dpo with Docker Model Runner:
```
docker model run hf.co/AbdoSaad24/deepseek-coder-6.7b-security-dpo
```

deepseek-coder-6.7b-security-dpo

A Direct Preference Optimization (DPO) fine-tune of deepseek-ai/deepseek-coder-6.7b-instruct specialized for generating secure, vulnerability-free Python code from natural language task descriptions — without requiring a separate audit or review step.

Kaggle notebook: code-generation

What makes this model different

Most code generation models produce the most straightforward implementation of a task — which is often the most vulnerable one. A naive model asked to "write a login function that queries a user by username" will likely produce an f-string SQL query wide open to injection.

This model was trained with DPO to treat the secure implementation as always preferred over the naive one, for the exact same task. It writes safe code on the first pass — no second-pass auditor or reviewer needed.

User task description
  (e.g. "write a Flask login route")
          │
          ▼
  [deepseek-coder-6.7b-security-dpo]
          │
          ▼
  Secure code — parameterized queries,
  env-var credentials, no shell=True, bcrypt passwords
  (no separate audit step required)

Vulnerability classes prevented by default

Vulnerability	What a naive model writes	What this model writes instead
SQL injection	`f"SELECT * FROM users WHERE name='{name}'"`	Parameterized query with `%s` placeholder
Hardcoded credentials	`DATABASE_URL = "postgres://admin:secret@..."`	Credentials read from `os.environ`
Command injection	`subprocess.run(f"python {filename}", shell=True)`	Argument list + path validation, `shell=False`
Insecure deserialization	`pickle.loads(cookie_bytes)`	`json.loads` with schema validation or signed token
Weak password storage	`db.save(password=password)`	`bcrypt.hashpw` / `argon2` hashing
XSS	Unescaped template interpolation	Output escaping, template auto-escape enabled

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "AbdoSaad24/deepseek-coder-6.7b-security-dpo"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

SYSTEM_PROMPT = "\n".join([
    "You are a senior software security engineer and Python developer.",
    "When given a task description, write clean, working Python code that is secure by default.",
])

def generate_secure_code(task_description: str, max_new_tokens: int = 512) -> str:
    """Generate secure Python code for a given task description."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": f"Write secure Python code that performs the following task:\n\n{task_description}"}
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            temperature=1.0,
            pad_token_id=tokenizer.eos_token_id,
        )

    generated = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True)

Example: database user lookup

print(generate_secure_code(
    "Write a Python function that looks up a user record from a "
    "PostgreSQL database by username."
))
# → Uses psycopg2 with cursor.execute(query, (username,))
# → Never uses f-strings or .format() in SQL

Example: database connection setup

print(generate_secure_code(
    "Write a Python function that connects to a PostgreSQL database "
    "and returns the connection object."
))
# → Reads host, user, password from os.environ
# → Never hardcodes credentials in source

Example: user registration route

print(generate_secure_code(
    "Write a Flask route that accepts a username and password "
    "and registers a new user in a SQLite database."
))
# → Hashes password with bcrypt before storing
# → Uses parameterized INSERT statement

Example: script runner

print(generate_secure_code(
    "Write a Python function that runs a report script "
    "given a filename provided by the user."
))
# → Uses subprocess.run(["python", validated_path], shell=False)
# → Validates filename against allowlist before executing

Example: session loading

print(generate_secure_code(
    "Write a Python function that loads a user session object "
    "from data stored in a cookie."
))
# → Uses json.loads or itsdangerous signed token
# → Never uses pickle.loads on untrusted input

System prompt

The model was fine-tuned with the following security-focused system prompt injected at training time. Use it at inference for best results:

You are a senior software security engineer and Python developer.

When given a task description, you write clean, working Python code
that is secure by default — without needing a separate security review.

Security standards you always follow:
- Never construct SQL queries with string formatting or concatenation; always use parameterized queries
- Never hardcode credentials, API keys, tokens, or secrets; always read from environment variables
- Never use shell=True with user-supplied input; always use argument lists
- Never deserialize untrusted data with pickle; use safe alternatives (json, etc.)
- Always validate and sanitize user input before use
- Use strong hashing (bcrypt / argon2) for passwords; never md5 or sha1
- Apply least-privilege: request only the permissions the code actually needs

Output Rules:
- Return ONLY the code implementation
- Add brief inline comments where a security choice might not be obvious
- Do NOT include explanations or prose outside the code block

Training details

Base model

deepseek-ai/deepseek-coder-6.7b-instruct — the instruction-tuned DeepSeek-Coder variant, fine-tuned further with DPO to align code generation preferences toward security.

Why DPO and not SFT?

DPO trains the model to prefer one response over another for the same prompt — which is exactly the right signal for security alignment. For every task (e.g. "write a login function"), there exists both a secure and a vulnerable implementation. DPO teaches the model that the secure one is always preferred, without needing a separate reward model. SFT alone would only teach the model to imitate the secure examples, without learning to avoid the vulnerable patterns.

Dataset

Dataset	Role	Description
`CyberNative/Code_Vulnerability_Security_DPO`	Primary DPO training	Chosen = secure implementation of task; Rejected = vulnerable implementation of the same task

Dataset filtering was applied before training — a critical step omitted in most notebooks. Pairs were filtered by:

Language: Python and C++ only
Minimum prompt length: 50 characters
Minimum chosen response length: 80 characters
Minimum rejected response length: 20 characters
Chosen and rejected must differ (identical pairs removed)
Minimum length delta between chosen and rejected: 30 characters

Stage	Count
Raw dataset	4,656 pairs
After filtering	748 pairs
Train split (70%)	523 pairs
Eval split (30%)	225 pairs

Fine-tuning method: DPO QLoRA via LLaMA-Factory

Hyperparameter	Value	Rationale
Framework	LLaMA-Factory 0.9.5
Stage	DPO	Direct preference, no reward model needed
Fine-tuning type	LoRA (QLoRA 4-bit)	Fits 6.7B model in T4 16 GB VRAM
Chat template	`deepseek`
LoRA rank	16	Sufficient for domain adaptation; higher risks overfitting on small DPO set
LoRA alpha	32	2× rank — stable default
LoRA dropout	0.05
LoRA target modules	`q_proj, k_proj, v_proj, o_proj`	Attention layers carry the preference signal most efficiently in DPO
DPO β (pref_beta)	0.1	Standard KL penalty — keeps output fluent while learning preference
DPO loss	Sigmoid	Standard DPO loss formulation
Quantization	4-bit NF4 (QLoRA)
Context length	2048 tokens	Covers full task prompt + complete code implementations
Batch size per device	1
Gradient accumulation steps	8 (effective batch = 8)
Learning rate	5e-5	Empirically stable for DPO; higher rates collapse the reward margin
LR scheduler	Cosine
Warmup steps	10
Epochs	5
Mixed precision	FP16	T4 does not support BF16
Optimizer	AdamW (torch)
Best model selection	Lowest eval loss
Hardware	NVIDIA Tesla T4 — 15.6 GB VRAM (Kaggle)
Experiment tracking	Weights & Biases (`deepseek-coder-security-dpo`)

After training, LoRA adapters were merged into the base model weights using LLaMA-Factory's export pipeline on CPU and pushed as a single standalone model.

Evaluation test cases

The model was evaluated on 5 task descriptions specifically chosen because a naive model would produce a vulnerable implementation for each:

Task	Naive vulnerability	Secure pattern expected
Database user lookup	SQL injection via f-string	Parameterized query with `%s`
Database connection setup	Hardcoded credentials in source	Credentials from `os.environ`
Script runner	`subprocess.run(shell=True)` with f-string	Argument list + path validation
User session loading	`pickle.loads` on untrusted cookie	`json.loads` with validation or signed token
User registration (Flask)	Plain-text password + string-interpolated SQL	`bcrypt` hash + parameterized INSERT

Intended use

This model is designed as a secure-by-default code generation backend for:

Coding assistants and IDE integrations where security matters
Agentic pipelines that generate code without a separate audit node
Internal developer tools at security-conscious organisations
Educational environments teaching secure coding patterns
Any workflow where generated code goes closer to production

Out-of-scope use

Languages other than Python and C++ (training data covers these primarily)
Generating code for malicious or offensive purposes
Production use without any human review — the model significantly reduces vulnerabilities but is not a formal security audit

Limitations

Training dataset is relatively small (523 pairs after filtering); the model may not generalise to all vulnerability classes or uncommon patterns
The model focuses on common vulnerability categories covered in the training data — novel or highly domain-specific vulnerabilities may not be handled
Generated code should still be reviewed by a developer before deployment
Performance on languages other than Python and C++ is not guaranteed
The model does not perform static analysis — it generates based on learned preferences, not formal verification

Citation

If you use this model, please cite the original DeepSeek-Coder work:

@misc{guo2024deepseekcoderlargelanguagemodel,
  title={DeepSeek-Coder: When the Large Language Model Meets Programming},
  author={Daya Guo et al.},
  year={2024},
  eprint={2401.14196},
  archivePrefix={arXiv}
}

Fine-tuned by AbdoSaad24 · Kaggle notebook: code-generation

Downloads last month: 13

Safetensors

Model size

7B params

Tensor type

BF16

Model tree for AbdoSaad24/deepseek-coder-6.7b-security-dpo

Base model

deepseek-ai/deepseek-coder-6.7b-instruct

Adapter

(391)

this model

Dataset used to train AbdoSaad24/deepseek-coder-6.7b-security-dpo

Paper for AbdoSaad24/deepseek-coder-6.7b-security-dpo

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 72