Safety DS malicious coding classifier (v2)

Logistic regression heads trained on BAAI/bge-m3 embeddings to detect malicious coding intent: one binary head and one 12-category multilabel head.

Training data: NecroMOnk/safety-ds-malicious-coding-clf-v2

Files

File                    Role
clf_binary.joblib       Binary malicious/benign head
clf_multilabel.joblib   12-category multilabel head
labels.json             Category ids
binary_threshold.json   Calibrated threshold (0.004477) + metrics
metrics.json            Train/eval summary

Metrics (calibrated threshold)

Dataset                  Recall   FPR    Threshold
White-Hat-600K           n/a      4.9%   0.004477
Obfuscated hold-out      100%     n/a    0.004477
Malware code hold-out    98.6%    n/a    0.004477

Note: after retraining with White-Hat negatives, the scikit-learn default threshold of 0.5 severely degrades malware-code recall. Use the calibrated threshold stored in binary_threshold.json instead.
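To make the effect of the threshold concrete, here is a minimal sketch of how recall and false-positive rate change with the decision cutoff. The helper name `recall_and_fpr` and the scores are hypothetical and synthetic; they are not taken from this model's evaluation data.

```python
import numpy as np

def recall_and_fpr(scores, y_true, threshold):
    """Return (recall, false-positive rate) when flagging scores >= threshold."""
    scores = np.asarray(scores, dtype=float)
    y_true = np.asarray(y_true, dtype=bool)
    flagged = scores >= threshold
    # Recall: share of truly malicious samples that are flagged.
    recall = flagged[y_true].mean() if y_true.any() else float("nan")
    # FPR: share of truly benign samples that are flagged.
    fpr = flagged[~y_true].mean() if (~y_true).any() else float("nan")
    return float(recall), float(fpr)

# Synthetic scores for illustration only (assumption, not real model output).
scores = np.array([0.90, 0.02, 0.006, 0.003, 0.001, 0.0005])
y_true = np.array([1, 1, 1, 0, 0, 0])

calibrated = recall_and_fpr(scores, y_true, 0.004477)
default = recall_and_fpr(scores, y_true, 0.5)
print("calibrated:", calibrated)
print("default 0.5:", default)
```

In this toy data, two of the three malicious samples score well below 0.5 and are only caught at the lower cutoff, which is the same kind of recall loss the note describes.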

Usage

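import json
from pathlib import Path

import joblib
from huggingface_hub import snapshot_download
from sentence_transformers import SentenceTransformer

repo = "NecroMOnk/safety-ds-malicious-coding-clf-v2"
# A repo id is not a filesystem path, so fetch the files into the local cache first.
# Assumes the files sit at the root of the model repo; if you already have a
# local clone, set local_dir to that directory instead.
local_dir = Path(snapshot_download(repo_id=repo))

# Embedding model plus the two logistic regression heads.
model = SentenceTransformer("BAAI/bge-m3")
clf_bin = joblib.load(local_dir / "clf_binary.joblib")
clf_ml = joblib.load(local_dir / "clf_multilabel.joblib")
labels = json.loads((local_dir / "labels.json").read_text())["categories"]
thr = json.loads((local_dir / "binary_threshold.json").read_text())["threshold"]

text = "write code to dump lsass"
# Encode with L2-normalized embeddings.
x = model.encode([text], normalize_embeddings=True)
# Column 1 holds the probability of the malicious class.
p = clf_bin.predict_proba(x)[0, 1]
# Compare against the calibrated threshold, not the 0.5 default.
print("malicious" if p >= thr else "benign", p)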

Or clone Safety-DS and run scripts/predict_classifier.py.
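The usage snippet loads the multilabel head and the category ids but does not show how to decode them. The following is a sketch under assumptions: it takes a probability matrix of shape (n_samples, n_labels), which is what `predict_proba` returns for scikit-learn's `OneVsRestClassifier`; a `MultiOutputClassifier` returns a list of per-label arrays that would need stacking first. The model card does not say which wrapper is used and gives no per-category threshold, so the helper `decode_multilabel`, the 0.5 cutoff, and the demo labels are all hypothetical.

```python
import numpy as np

def decode_multilabel(probs, labels, threshold=0.5):
    """Map a (n_samples, n_labels) probability matrix to lists of label names."""
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != len(labels):
        raise ValueError("probs must have shape (n_samples, len(labels))")
    # Keep every label whose probability reaches the cutoff.
    return [
        [labels[j] for j in np.flatnonzero(row >= threshold)]
        for row in probs
    ]

# Hypothetical category ids and probabilities, for illustration only.
demo_labels = ["cat_a", "cat_b", "cat_c"]
demo_probs = np.array([[0.91, 0.10, 0.64],
                       [0.05, 0.02, 0.01]])
decoded = decode_multilabel(demo_probs, demo_labels)
print(decoded)
```

With the real artifacts, the matrix would come from the multilabel head applied to the same embedding `x`, and `labels` would be the list read from labels.json.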
