SalesCue — spam

SpamHead module from the SalesCue sales intelligence library.

Status: untrained — architecture only, random initialization. Use as a starting point for fine-tuning.

Research Contribution

Multi-Probe Hierarchical Bayesian Attention Gating with Adversarial Calibration

Spam signals depend on the level at which they occur: spammy tokens ('FREE') contribute differently from spammy sentence structures (urgency patterns), which in turn contribute differently from spammy document profiles (link density, header anomalies). SpamHead introduces a Hierarchical Bayesian Attention Gate (HBAG) with 4 aspect-specific probes (content, structure, deception, synthetic) that operate at the token, sentence, and document levels simultaneously. Each probe computes independent Beta(α,β) posteriors, which are blended by a learned gating mechanism. Per-sentence token-span encoding keeps the processing truly hierarchical: each sentence gets its own neural signal from its token span. An adversarial calibration loss pushes provider-specific scores to match empirical inbox placement distributions. Uncertainty is decomposed into aleatoric (category entropy) and epistemic (Beta variance) components.
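The uncertainty decomposition described above can be illustrated with a small sketch. It is not the SpamHead implementation: the function names, the example probe parameters, and the choice to blend probes as a gate-weighted average of their Beta means and variances are all assumptions made for illustration. Only the two formulas themselves (the Beta mean and variance, and the Shannon entropy of a categorical distribution) are standard.

```python
import math


def beta_mean_and_variance(alpha: float, beta: float) -> tuple[float, float]:
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total ** 2 * (total + 1.0))
    return mean, variance


def category_entropy(probs: list[float]) -> float:
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def blend_probes(posteriors, gate_weights):
    """Gate-weighted average of per-probe Beta means and variances.

    Assumption: the gate is modeled as a normalized weight vector. The
    real gating mechanism is learned and may combine probes differently.
    """
    z = sum(gate_weights)
    weights = [w / z for w in gate_weights]
    stats = [beta_mean_and_variance(a, b) for a, b in posteriors]
    mean = sum(w * m for w, (m, _) in zip(weights, stats))
    epistemic = sum(w * v for w, (_, v) in zip(weights, stats))
    return mean, epistemic


# Hypothetical (alpha, beta) pairs for the content, structure,
# deception, and synthetic probes, with made-up gate weights.
probes = [(8.0, 2.0), (3.0, 3.0), (1.5, 6.0), (2.0, 2.0)]
gates = [0.4, 0.3, 0.2, 0.1]
spam_score, epistemic = blend_probes(probes, gates)

# Aleatoric term: entropy of a hypothetical 7-category distribution.
aleatoric = category_entropy([0.6, 0.2, 0.1, 0.04, 0.03, 0.02, 0.01])
print(spam_score, epistemic, aleatoric)
```

Under this reading, a probe with small α + β has a wide posterior and contributes high epistemic uncertainty, while a flat category distribution yields high aleatoric uncertainty regardless of how much evidence the probes have seen.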

Sub-modules:

HierarchicalBayesianAttentionGate: 4-probe multi-aspect attention → per-sentence token-span aggregation (12 structural features) → document-level 7-category classification with uncertainty decomposition
AdversarialStyleTransferDetector: 32 information-theoretic features (Yule's K, Shannon entropy, Honoré's R, trigram repetition, perplexity ratio, trajectory smoothness, watermark detection per Kirchenbauer et al. 2023)
HeaderAnalyzer: SPF/DKIM/DMARC + routing analysis (16-dim feature vector)
TemporalBurstDetector: Cross-email send pattern analysis (Kleinberg burst model)
CampaignSimilarityDetector: Template detection via pairwise CLS cosine similarity with proper union-find clustering
ProviderCalibration: 6-provider deliverability (Gmail, Outlook, Yahoo, ProtonMail, Apple Mail, Corporate) with 10-feature input and adversarial calibration discriminator
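The template detection idea behind CampaignSimilarityDetector, pairwise cosine similarity followed by union-find clustering, can be sketched as follows. This is a minimal illustration, not the library's code: the `UnionFind` class, the `cluster_by_similarity` helper, the 0.9 threshold, and the toy 2-dimensional vectors standing in for 768-dimensional CLS embeddings are all assumptions.

```python
import math


class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            # Path halving: point x at its grandparent as we walk up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def cluster_by_similarity(embeddings, threshold=0.9):
    """Group indices whose embeddings are linked by similarity >= threshold."""
    n = len(embeddings)
    uf = UnionFind(n)
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                uf.union(i, j)
    clusters = {}
    for i in range(n):
        clusters.setdefault(uf.find(i), []).append(i)
    return sorted(clusters.values())


# Toy embeddings: emails 0 and 1 are near-duplicates, as are 2 and 3.
emails = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.05, 1.0]]
print(cluster_by_similarity(emails))  # [[0, 1], [2, 3]]
```

Union-find makes the clustering transitive: if email A resembles B and B resembles C, all three land in one campaign cluster even when A and C fall just below the threshold.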

7-category taxonomy: clean, template_spam, ai_generated, low_effort, role_account, domain_suspect, content_violation. Residual gate decision network with layer norm. Production path: the DeBERTa model is distilled into 24-feature logistic regression weights, which are loaded by a Rust SpamClassifier with SoA (structure-of-arrays) batch processing.

Usage

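from salescue import SalesCueModel

# This release ships randomly initialized weights, so predictions are
# not meaningful until the head has been fine-tuned.
model = SalesCueModel.from_pretrained("v9ai/salescue-spam-v1")
result = model.predict("your sales text here")
print(result)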

Labels

clean
template_spam
ai_generated
low_effort
role_account
domain_suspect
content_violation

Architecture

Backbone: microsoft/deberta-v3-base (shared encoder, 768-dim)
Head: SpamHead
Parameters: head only (backbone loaded separately)

Intended Use

Primary: B2B sales intelligence — lead scoring, email analysis, conversation insights
Users: Sales teams, RevOps, GTM engineers building sales automation
Input: English sales text (emails, call transcripts, prospect communications)

Limitations

Untrained weights: This release contains the architecture only. Weights are randomly initialized and must be fine-tuned on domain-specific data before production use.
English only: Designed for English sales text. Performance on other languages is untested.
Domain-specific: Optimized for B2B sales communications. May not generalize to other text domains.
Shared backbone: Requires microsoft/deberta-v3-base loaded via the SalesCue library.

About SalesCue

SalesCue is a sales intelligence library with 12 ML modules sharing a single DeBERTa-v3-base encoder backbone. Modules can be composed via Unix-style piping:

from salescue import Document

# `ai` exposes the modules (score, intent, sentiment); its import is not shown in this snippet.
result = Document("interested in pricing") | ai.score | ai.intent | ai.sentiment
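For readers unfamiliar with how `|` can chain processing steps in Python, the following sketch shows one common way to get that behavior by overloading `__or__`. It is an illustration only and makes no claim about how SalesCue implements piping: `PipedDocument`, `stage`, and the two toy stages are hypothetical names invented here.

```python
from dataclasses import dataclass, field


@dataclass
class PipedDocument:
    """Hypothetical stand-in for a document that accumulates results."""
    text: str
    results: dict = field(default_factory=dict)

    def __or__(self, module):
        # `doc | module` calls the module with the document and returns
        # whatever the module returns, so stages can be chained.
        return module(self)


def stage(name, fn):
    """Wrap a text function as a pipeline stage that records its output."""
    def run(doc):
        merged = dict(doc.results)
        merged[name] = fn(doc.text)
        return PipedDocument(doc.text, merged)
    return run


# Two toy stages standing in for real ML modules.
word_count = stage("word_count", lambda t: len(t.split()))
mentions_pricing = stage("mentions_pricing", lambda t: "pricing" in t.lower())

result = PipedDocument("interested in pricing") | word_count | mentions_pricing
print(result.results)  # {'word_count': 3, 'mentions_pricing': True}
```

Each stage returns a new document rather than mutating its input, which keeps intermediate results reusable when the same text is piped through different module combinations.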

All modules: score, intent, reply, triggers, icp, objection, sentiment, spam, entities, call, subject, emailgen

See the SalesCue documentation for details.
