---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- code
- security
- qwen
- securecode
- owasp
- vulnerability-detection
datasets:
- scthornton/securecode-v2
language:
- en
library_name: transformers
pipeline_tag: text-generation
arxiv: 2512.18542
---
# Qwen 2.5-Coder 7B - SecureCode Edition

<div align="center">

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Dataset: SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2)
[Base Model: Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
[perfecXion.ai](https://perfecxion.ai)

**Best-in-class code model fine-tuned for security - exceptional code understanding**

[Paper](https://arxiv.org/abs/2512.18542) | [Model Card](https://huggingface.co/scthornton/qwen-coder-7b-securecode) | [Dataset](https://huggingface.co/datasets/scthornton/securecode-v2) | [perfecXion.ai](https://perfecxion.ai) | [Security Research](https://perfecxion.ai/security)

</div>

---
## What is This?

This is **Qwen 2.5-Coder 7B Instruct** fine-tuned on the **SecureCode v2.0 dataset**: a base model widely regarded as the strongest in the 7B code-model class, now enhanced with production-grade security knowledge.

Unlike standard code models, which frequently generate vulnerable code, this model combines Qwen's exceptional code understanding with targeted training to:

✅ **Recognize security vulnerabilities** across 11 programming languages
✅ **Generate secure implementations** with defense-in-depth patterns
✅ **Explain complex attack vectors** with concrete exploitation examples
✅ **Provide operational guidance** including SIEM integration, logging, and monitoring

**The Result:** The most capable security-aware code model under 10B parameters.
**Why Qwen 2.5-Coder?** The base model was pre-trained on **5.5 trillion tokens** of code data, giving it:

- **Superior code completion** - best-in-class at completing partial code
- **Deep code understanding** - exceptional at analyzing complex codebases
- **92 programming languages** - broader language support than competitors
- **128K context window** - can analyze entire files and multi-file contexts
- **Fast inference** - optimized for production deployment

---
## The Problem This Solves

**AI coding assistants produce vulnerable code in 45% of security-relevant scenarios** (Veracode 2025). Standard code models excel at syntax but lack security awareness.

**Real-world costs:**

- Equifax breach (SQL injection): **$425 million** in settlement costs
- Capital One (SSRF attack): **100 million** customer records exposed
- SolarWinds (supply-chain compromise): **18,000** organizations exposed to a trojanized update

Qwen 2.5-Coder SecureCode Edition helps prevent these scenarios by combining world-class code generation with security expertise.

---
## Key Features

### Best Code Understanding in Class

**Qwen 2.5-Coder** leads its class on code benchmarks:

- HumanEval: **88.2%** pass@1
- MBPP: **75.8%** pass@1
- LiveCodeBench: **35.1%** pass@1
- Better than CodeLlama 34B and comparable to GPT-4

Now with **1,209 security-focused examples** adding vulnerability awareness.
### Security-First Code Generation

Trained on real-world security incidents, including:

- **224 examples** of Broken Access Control vulnerabilities
- **199 examples** of Authentication Failures
- **125 examples** of Injection attacks (SQL, Command, XSS)
- **115 examples** of Cryptographic Failures
- Complete coverage of the **OWASP Top 10:2025**
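As a minimal, self-contained illustration of the injection category above (this is illustrative only, not model output and not drawn from the dataset), the difference between string-built SQL and a parameterized query can be shown with Python's built-in `sqlite3`:

```python
import sqlite3

# In-memory database with one user, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attacker_input = "' OR '1'='1"

# Vulnerable: attacker input is spliced directly into the SQL string,
# so the trailing OR clause makes the WHERE condition always true.
vulnerable = (
    f"SELECT * FROM users "
    f"WHERE username='{attacker_input}' AND password='{attacker_input}'"
)
assert conn.execute(vulnerable).fetchone() is not None  # login bypassed

# Secure: placeholders keep the input as data, never as SQL syntax.
secure = "SELECT * FROM users WHERE username=? AND password=?"
assert conn.execute(secure, (attacker_input, attacker_input)).fetchone() is None
```

The secure variant is the same pattern the "secure implementation" turns in the dataset demonstrate for SQL injection.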
### Multi-Language Security Expertise

Fine-tuned on security examples across:

- Python (Django, Flask, FastAPI)
- JavaScript/TypeScript (Express, NestJS, React)
- Java (Spring Boot)
- Go (Gin framework)
- PHP (Laravel, Symfony)
- C# (ASP.NET Core)
- Ruby (Rails)
- Rust (Actix, Rocket)
- **Plus 84 more languages from Qwen's base training**
### Comprehensive Security Context

Every response includes:

1. **Vulnerable implementation** showing what NOT to do
2. **Secure implementation** with industry best practices
3. **Attack demonstration** proving the vulnerability is real
4. **Defense-in-depth guidance** for production deployment

---
## Training Details

| Parameter | Value |
|-----------|-------|
| **Base Model** | Qwen/Qwen2.5-Coder-7B-Instruct |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
| **Training Dataset** | [SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2) |
| **Dataset Size** | 841 training examples |
| **Training Epochs** | 3 |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 32 |
| **Learning Rate** | 2e-4 |
| **Quantization** | 4-bit (bitsandbytes) |
| **Trainable Parameters** | 40.4M (0.53% of 7.6B total) |
| **Total Parameters** | 7.6B |
| **Context Window** | 128K tokens (inherited from base) |
| **GPU Used** | NVIDIA A100 40GB |
| **Training Time** | ~90 minutes |
### Training Methodology

**LoRA (Low-Rank Adaptation)** preserves Qwen's exceptional code abilities while adding security knowledge:

- Trains only 0.53% of model parameters
- Maintains the base model's code generation quality
- Adds security-specific knowledge without catastrophic forgetting
- Enables deployment with minimal memory overhead

**4-bit Quantization** enables efficient training while maintaining model quality.

**Extended Context:** Qwen's 128K context window allows analyzing entire source files, making it well suited to security audits of large codebases.
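The parameter-efficiency figure in the training table can be sanity-checked directly; the numbers below are taken from the table, not measured:

```python
# Values from the training table above.
trainable_params = 40.4e6   # LoRA adapter parameters
total_params = 7.6e9        # full model parameters

trainable_pct = 100 * trainable_params / total_params
print(f"Trainable fraction: {trainable_pct:.2f}%")  # matches the table's 0.53%
```

This small fraction is why the adapter can be distributed separately and merged onto the base model at load time.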
---

## Usage

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

# Load SecureCode LoRA adapter
model = PeftModel.from_pretrained(model, "scthornton/qwen-coder-7b-securecode")

# Generate secure code
prompt = """### User:
Review this Python Flask authentication code for security vulnerabilities:

```python
@app.route('/login', methods=['POST'])
def login():
    username = request.form['username']
    password = request.form['password']
    query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
    user = db.execute(query).fetchone()
    if user:
        session['user_id'] = user['id']
        return redirect('/dashboard')
    return 'Invalid credentials'
```

### Assistant:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.7,
    top_p=0.95,
    do_sample=True
)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
### Run on Consumer Hardware (4-bit)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization - runs on a 16GB GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16"
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, "scthornton/qwen-coder-7b-securecode")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct", trust_remote_code=True)

# Now runs on an RTX 3090/4080!
```
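A rough back-of-envelope shows why the 4-bit weights fit on a 12-16 GB card. This estimate covers weights only and deliberately ignores activations, the KV cache, and framework overhead, which add several GB in practice:

```python
# Weight-memory estimate for a 7.6B-parameter model (weights only).
total_params = 7.6e9
bytes_per_param_4bit = 0.5   # 4 bits = half a byte per weight
bytes_per_param_bf16 = 2.0   # 16 bits = two bytes per weight

weights_4bit_gb = total_params * bytes_per_param_4bit / 1e9
weights_bf16_gb = total_params * bytes_per_param_bf16 / 1e9
print(f"4-bit weights: ~{weights_4bit_gb:.1f} GB")   # ~3.8 GB
print(f"bf16 weights:  ~{weights_bf16_gb:.1f} GB")   # ~15.2 GB
```

The ~3.8 GB of quantized weights leaves headroom for the LoRA adapter and inference buffers on a 12 GB GPU, consistent with the hardware requirements listed below.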
### Code Review Use Case

```python
# Security audit of an entire file
with open("app.py", "r") as f:
    code_to_review = f.read()

prompt = f"""### User:
Perform a comprehensive security review of this application code. Identify all OWASP Top 10 vulnerabilities.

```python
{code_to_review}
```

### Assistant:
"""

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=32768).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=4096, temperature=0.3)  # lower temperature for precise analysis
review = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(review)
```
---

## Use Cases

### 1. **Automated Security Code Review**

Qwen's superior code understanding makes it well suited to reviewing complex codebases:

```
Analyze this 500-line authentication module for security vulnerabilities
```

### 2. **Multi-File Security Analysis**

With 128K context, analyze entire projects:

```
Review these 3 related files for security issues: auth.py, middleware.py, models.py
```

### 3. **Advanced Vulnerability Explanation**

Qwen excels at explaining complex attack chains:

```
Explain how an attacker could chain SSRF with authentication bypass in this microservices architecture
```

### 4. **Production Security Architecture**

Get architectural security guidance:

```
Design a secure authentication system for a distributed microservices platform handling 100K requests/second
```

### 5. **Multi-Language Security Refactoring**

Works across Qwen's 92 supported languages:

```
Refactor this Java Spring Boot controller to fix authentication vulnerabilities
```

---
## Limitations

### What This Model Does Well

✅ Exceptional code understanding and completion
✅ Multi-language security analysis (92 languages)
✅ Large context window for file/project analysis
✅ Detailed vulnerability explanations with examples
✅ Complex attack chain analysis

### What This Model Doesn't Do

❌ **Not a security scanner** - use tools like Semgrep, CodeQL, or Snyk
❌ **Not a penetration testing tool** - cannot perform active exploitation
❌ **Not legal/compliance advice** - consult security professionals
❌ **Not a replacement for security experts** - critical systems need professional review

### Known Issues

- May generate verbose responses (trained on detailed security explanations)
- Strongest on common vulnerability patterns (OWASP Top 10) rather than novel 0-days
- Requires a 16GB+ GPU for optimal performance (12GB minimum with 4-bit quantization)
---

## Performance Benchmarks

### Hardware Requirements

**Minimum:**

- 16GB RAM
- 12GB GPU VRAM (with 4-bit quantization)

**Recommended:**

- 32GB RAM
- 16GB+ GPU (RTX 3090, A5000, etc.)

**Inference Speed (on RTX 3090 24GB):**

- ~40 tokens/second with 4-bit quantization
- ~60 tokens/second with bfloat16 (unquantized)

### Code Generation Benchmarks (Base Qwen 2.5-Coder)

| Benchmark | Score | Rank |
|-----------|-------|------|
| HumanEval | 88.2% | #1 in 7B class |
| MBPP | 75.8% | #1 in 7B class |
| LiveCodeBench | 35.1% | Top 3 overall |
| MultiPL-E | 78.9% | Best multi-language |

**Security benchmarks coming soon** - community contributions welcome!
---

## Dataset Information

This model was trained on **[SecureCode v2.0](https://huggingface.co/datasets/scthornton/securecode-v2)**, a production-grade security dataset with:

- **1,209 total examples** (841 train / 175 validation / 193 test)
- **100% incident grounding** - every example tied to real CVEs or security breaches
- **11 vulnerability categories** - complete OWASP Top 10:2025 coverage
- **11 programming languages** - from Python to Rust
- **4-turn conversational structure** - mirrors real developer-AI workflows
- **100% expert validation** - reviewed by independent security professionals

See the [full dataset card](https://huggingface.co/datasets/scthornton/securecode-v2) for complete details.
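The quoted split sizes are internally consistent with the stated total:

```python
# Split sizes from the dataset description above.
train, validation, test = 841, 175, 193
total = train + validation + test
print(total)  # 1209, matching the stated total example count
```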
---

## About perfecXion.ai

[perfecXion.ai](https://perfecxion.ai) is dedicated to advancing AI security through research, datasets, and production-grade security tooling.

**Connect:**

- Website: [perfecxion.ai](https://perfecxion.ai)
- Research: [perfecxion.ai/research](https://perfecxion.ai/research)
- GitHub: [@scthornton](https://github.com/scthornton)
- HuggingFace: [@scthornton](https://huggingface.co/scthornton)

---

## License

**Model License:** Apache 2.0 (commercial use permitted)

**Dataset License:** CC BY-NC-SA 4.0

---
## Citation

```bibtex
@misc{thornton2025securecode-qwen7b,
  title={Qwen 2.5-Coder 7B - SecureCode Edition},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://huggingface.co/scthornton/qwen-coder-7b-securecode},
  note={Fine-tuned on SecureCode v2.0}
}
```
---

## Acknowledgments

- **Alibaba Cloud & Qwen Team** for the exceptional Qwen 2.5-Coder base model
- **OWASP Foundation** for maintaining the Top 10 vulnerability taxonomy
- **MITRE Corporation** for the CVE database
- **Hugging Face** for infrastructure

---

## Related Models in the SecureCode Collection

- **[llama-3.2-3b-securecode](https://huggingface.co/scthornton/llama-3.2-3b-securecode)** - most accessible (3B)
- **[deepseek-coder-6.7b-securecode](https://huggingface.co/scthornton/deepseek-coder-6.7b-securecode)** - security-optimized (6.7B)
- **[codellama-13b-securecode](https://huggingface.co/scthornton/codellama-13b-securecode)** - established brand (13B)
- **[starcoder2-15b-securecode](https://huggingface.co/scthornton/starcoder2-15b-securecode)** - multi-language specialist (15B)

View the complete collection: [SecureCode Models](https://huggingface.co/collections/scthornton/securecode)
---

<div align="center">

**Built with ❤️ for secure software development**

[perfecXion.ai](https://perfecxion.ai) | [Research](https://perfecxion.ai/research) | [Contact](mailto:scott@perfecxion.ai)

</div>