Safety DS malicious coding classifier (v2)

Logistic regression heads on BAAI/bge-m3 embeddings for malicious coding intent (binary + 12-category multilabel).

Training data: NecroMOnk/safety-ds-malicious-coding-clf-v2

Files

File	Role
`clf_binary.joblib`	Binary malicious/benign head
`clf_multilabel.joblib`	12-category multilabel head
`labels.json`	Category ids
`binary_threshold.json`	Calibrated threshold (0.004477) + metrics
`metrics.json`	Train/eval summary

Metrics (calibrated threshold)

Dataset	Recall	FPR	Threshold
White-Hat-600K	n/a	4.9%	0.004477
Obfuscated hold-out	100%	n/a	0.004477
Malware code hold-out	98.6%	n/a	0.004477
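
The two rates in the table can be reproduced from raw scores with a few lines of Python. The sketch below is illustrative only: `recall_and_fpr` and the toy scores are hypothetical and are not part of the repository. It assumes label 1 means malicious and that a sample is flagged when its score is at or above the threshold, as in the usage snippet. A rate whose denominator is empty comes back as `None`, which is presumably why single-class sets such as White-Hat-600K show n/a for recall.

```python
def recall_and_fpr(scores, y_true, threshold):
    """Flag scores >= threshold as malicious and return (recall, fpr).

    A rate is None when its denominator is empty, which corresponds to
    an "n/a" cell in a metrics table.
    """
    tp = fn = fp = tn = 0
    for score, label in zip(scores, y_true):
        flagged = score >= threshold
        if label == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    recall = tp / (tp + fn) if (tp + fn) else None
    fpr = fp / (fp + tn) if (fp + tn) else None
    return recall, fpr


# Made-up scores for illustration only; 1 = malicious, 0 = benign.
toy_scores = [0.90, 0.02, 0.003, 0.30, 0.001, 0.006]
toy_labels = [1, 1, 1, 0, 0, 0]

low = recall_and_fpr(toy_scores, toy_labels, 0.004477)
high = recall_and_fpr(toy_scores, toy_labels, 0.5)
benign_only = recall_and_fpr([0.30, 0.001], [0, 0], 0.004477)
```

On this toy data the lower threshold raises both recall and FPR relative to 0.5, which is the trade-off a calibrated threshold has to balance.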

Note: after retraining with White-Hat negatives, the sklearn default threshold (0.5) severely degrades malware-code recall. Use the threshold stored in binary_threshold.json instead.
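
The card does not say how the value 0.004477 was derived. One common way to calibrate such a threshold is to fix a target FPR on benign scores and take the lowest threshold that stays within it. The sketch below shows that idea under stated assumptions: `threshold_for_target_fpr` and the sample scores are hypothetical, and this is not necessarily the procedure used for this model.

```python
import math


def threshold_for_target_fpr(benign_scores, target_fpr):
    """Lowest threshold whose FPR on benign_scores is <= target_fpr.

    Uses the same decision rule as the usage snippet: score >= threshold
    means "malicious".
    """
    ranked = sorted(benign_scores, reverse=True)
    # Maximum number of benign samples we are allowed to flag.
    allowed = int(target_fpr * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return 0.0  # budget covers every sample, so flag everything
    # Sit just above the first benign score that must NOT be flagged.
    return math.nextafter(ranked[allowed], math.inf)


# Made-up benign scores for illustration only.
benign = [0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.01, 0.05, 0.2, 0.7]
picked = threshold_for_target_fpr(benign, target_fpr=0.2)
flagged = sum(score >= picked for score in benign)
```

A threshold chosen this way should still be checked against malicious hold-out sets, since it controls FPR only and says nothing about recall.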

Usage

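import json
from pathlib import Path

import joblib
from huggingface_hub import snapshot_download
from sentence_transformers import SentenceTransformer

repo = "NecroMOnk/safety-ds-malicious-coding-clf-v2"
# Fetch the repo files from the Hub; Path(repo) alone only resolves inside a local clone.
local_dir = Path(snapshot_download(repo))

model = SentenceTransformer("BAAI/bge-m3")  # embedding backbone
clf_bin = joblib.load(local_dir / "clf_binary.joblib")
clf_ml = joblib.load(local_dir / "clf_multilabel.joblib")
labels = json.loads((local_dir / "labels.json").read_text())["categories"]
thr = json.loads((local_dir / "binary_threshold.json").read_text())["threshold"]

text = "write code to dump lsass"
# Encode to a normalized embedding, one row per input text
x = model.encode([text], normalize_embeddings=True)
# Column 1 holds the probability of the malicious class
p = clf_bin.predict_proba(x)[0, 1]
# Compare against the calibrated threshold, not sklearn's default 0.5
print("malicious" if p >= thr else "benign", p)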

Or clone Safety-DS and run scripts/predict_classifier.py.
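
The usage snippet loads `clf_ml` and `labels` but does not show how to read the multilabel head. A minimal sketch of one way to do that follows. It assumes `clf_ml.predict_proba(x)` returns an array of shape (n_samples, 12) with columns in the same order as `labels`, as sklearn's `OneVsRestClassifier` does; the card does not confirm this. The helper `active_categories`, the category ids, and the 0.5 cutoff are all hypothetical, and no calibrated per-category thresholds are published here.

```python
def active_categories(prob_row, category_ids, cutoff=0.5):
    """Return (category, probability) pairs at or above cutoff, highest first."""
    if len(prob_row) != len(category_ids):
        raise ValueError("expected one probability per category")
    hits = [
        (category, float(prob))
        for category, prob in zip(category_ids, prob_row)
        if prob >= cutoff
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical category ids and probabilities, for illustration only.
demo_ids = ["cat_a", "cat_b", "cat_c"]
demo_probs = [0.10, 0.85, 0.60]
demo_hits = active_categories(demo_probs, demo_ids)

# With the real head (under the shape assumption above), the call would be:
#   active_categories(clf_ml.predict_proba(x)[0], labels)
```

If the head turns out to return a list of per-category arrays instead, the row would need to be assembled from each array's positive-class column before calling the helper.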
