---
license: apache-2.0
---
# PyJavaCPP-Vuln-Fixer

PyJavaCPP-Vuln-Fixer is a security-focused code repair model fine-tuned to automatically fix vulnerabilities in:

- Python
- Java
- C++

The model is built on Qwen2.5-Coder-1.5B-Instruct and fine-tuned using LoRA for automated vulnerability remediation.

It takes vulnerable source code as input and outputs only the fixed, secure version of the code.

## Quick Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "jugalgajjar/PyJavaCPP-Vuln-Fixer"

# Load the tokenizer and model; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

SYSTEM_MESSAGE = (
    "You are a code security expert. Given vulnerable source code, "
    "output ONLY the fixed version of the code with the vulnerability repaired. "
    "Do not include explanations, just the corrected code."
)

# Example input: a Flask endpoint with a command injection vulnerability.
language = "python"
vulnerable_code = """import os
from flask import Flask, request

app = Flask(__name__)

@app.route("/run")
def run():
    cmd = request.args.get("cmd")
    return os.popen(cmd).read()

if __name__ == "__main__":
    app.run()"""

messages = [
    {"role": "system", "content": SYSTEM_MESSAGE},
    {"role": "user", "content": f"Fix the below given vulnerable {language} code:\n{vulnerable_code}"},
]

# Render the conversation with the model's chat template.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.2,
        top_p=0.95,
        do_sample=True,
        repetition_penalty=1.15,
    )

# Decode only the newly generated tokens, skipping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
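
To reuse the same prompt format for Java or C++ input, the message construction can be wrapped in a small helper. This is a minimal sketch, not part of the model's own API: `build_messages` and `SUPPORTED_LANGUAGES` are hypothetical names, and the language labels `"java"` and `"c++"` are assumptions, since only the `"python"` label appears in the example above.

```python
# Hypothetical helper (not shipped with the model): builds the chat messages
# in the same format as the Quick Usage example.

# Assumed language labels; only "python" is confirmed by the example above.
SUPPORTED_LANGUAGES = {"python", "java", "c++"}

SYSTEM_MESSAGE = (
    "You are a code security expert. Given vulnerable source code, "
    "output ONLY the fixed version of the code with the vulnerability repaired. "
    "Do not include explanations, just the corrected code."
)


def build_messages(language: str, vulnerable_code: str) -> list[dict[str, str]]:
    """Return system and user messages for one vulnerable snippet."""
    lang = language.strip().lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    return [
        {"role": "system", "content": SYSTEM_MESSAGE},
        {
            "role": "user",
            "content": f"Fix the below given vulnerable {lang} code:\n{vulnerable_code}",
        },
    ]
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of `messages`.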

## Intended Use

- Automated vulnerability remediation
- Secure code refactoring
- Research in AI-assisted program repair
- Secure CI/CD integration
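
For CI/CD integration, one cautious pattern is to surface the model's output as a diff for human review rather than applying it directly. The sketch below uses only the standard-library `difflib`; `review_diff` is a hypothetical helper name, and the `fixed` string is a hand-written stand-in, not actual model output.

```python
import difflib


def review_diff(original: str, fixed: str, filename: str = "source") -> str:
    """Return a unified diff between the original and the proposed fix."""
    diff = difflib.unified_diff(
        original.splitlines(),
        fixed.splitlines(),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
        lineterm="",
    )
    return "\n".join(diff)


original = "cmd = request.args.get('cmd')\nreturn os.popen(cmd).read()\n"
# Stand-in for a model-generated fix (assumption, for illustration only).
fixed = "cmd = request.args.get('cmd')\nreturn run_allowed_command(cmd)\n"

print(review_diff(original, fixed, "app.py"))
```

An empty string means the proposed code is identical to the input, which a pipeline could treat as "no change suggested".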