AEGIS-FIN-1
A domain-specialized financial AI assistant fine-tuned for safe, structured transaction reasoning.
Model Description
AEGIS-FIN-1 is a Mistral-7B-Instruct-v0.3 model fine-tuned with QLoRA for financial transaction classification, bill negotiation, credit optimization, and BNPL (Buy Now Pay Later) decision support. It operates as the intelligence layer within the AEGIS Heimdall safety pipeline.
| Property | Value |
|---|---|
| Base Model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank | r=16, alpha=32 |
| Training Epochs | 3 |
| Training Data | ~100,000 synthetic financial queries |
| Quantization | 4-bit NF4 with double quantization |
Intended Use
AEGIS-FIN-1 is designed to:
- Classify user financial intents (purchase, bill negotiation, credit optimization, BNPL, etc.)
- Generate structured JSON responses for downstream pipeline processing
- Operate within a multi-stage safety pipeline with input validation, injection detection, and compliance checks
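A downstream consumer will usually want to validate the structured JSON before acting on it. The card does not publish the response schema, so the sketch below is only illustrative: the field names `intent` and `confidence`, the label strings in `ALLOWED_INTENTS`, and the function `parse_model_response` are all hypothetical.

```python
import json

# Hypothetical intent labels. The card names these categories but not their identifiers.
ALLOWED_INTENTS = {"purchase", "bill_negotiation", "credit_optimization", "bnpl"}


def parse_model_response(raw: str) -> dict:
    """Parse a model reply and reject anything outside the assumed JSON shape."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ValueError("response must be a JSON object")

    intent = payload.get("intent")
    if intent not in ALLOWED_INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")

    confidence = payload.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a number in [0, 1]")

    return payload
```

Failing closed on malformed output keeps free-form text from leaking into later pipeline stages.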
Out-of-Scope Uses
- General-purpose chatbot or conversation
- Medical, legal, or non-financial advice
- Direct financial transactions (model recommends, does not execute)
- Use outside a safety pipeline (model assumes upstream safety filtering)
Training Data
Training data was generated via a synthetic data distillation pipeline:
- Query Generation: 225+ natural language templates across 30+ financial intent categories
- User Profiles: 5,000 synthetic user profiles calibrated against the U.S. Federal Reserve Survey of Consumer Finances (SCF 2022)
- Teacher Distillation: Each query processed through a frontier model to generate structured JSON responses
- Topics Covered: Credit cards, BNPL, bill negotiation, subscriptions, budgeting, credit building, gig worker finance, tax basics, insurance, side hustles, fraud detection
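The query-generation step can be pictured as filling slots in natural language templates. The following is a minimal sketch of that idea only; the template text, slot names, slot values, and the `expand` helper are hypothetical and are not taken from the actual pipeline.

```python
from itertools import product
from string import Formatter


def expand(template: str, slots: dict) -> list:
    """Return every query produced by filling the template's named slots."""
    # Collect the field names that appear in the template, in order.
    fields = [name for _, name, _, _ in Formatter().parse(template) if name]
    # Take the Cartesian product of the candidate values for those fields.
    combos = product(*(slots[field] for field in fields))
    return [template.format(**dict(zip(fields, combo))) for combo in combos]


# Hypothetical example covering a formal and an informal phrasing.
slots = {"bill": ["internet", "phone"]}
formal = expand("How can I lower my {bill} bill?", slots)
informal = expand("my {bill} bill is way too high, any tips", slots)
```

Each generated query would then be paired with a synthetic user profile and sent to the teacher model, as the list above describes.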
Data Diversity
- Formal and informal language styles
- Income brackets: $15K–$200K+ (SCF-calibrated distribution)
- Credit tiers: Poor (300-579) through Excellent (800-850)
- No real user data was used at any stage
Training Procedure
| Parameter | Value |
|---|---|
| Framework | Hugging Face transformers + PEFT + TRL |
| Method | SFTTrainer with QLoRA |
| Hardware | NVIDIA A40 (48GB) |
| Training Time | ~2 hours |
| Effective Batch Size | 16 (batch size 4 × 4 gradient accumulation steps) |
| Learning Rate | 2e-4 (cosine scheduler) |
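The batch-size and learning-rate rows can be turned into a rough step count and schedule. This sketch assumes a single GPU, that all ~100,000 examples are used for training, and a cosine decay to zero with no warmup; none of these is stated in the card, so the step counts are approximate.

```python
import math

PER_DEVICE_BATCH = 4     # from the table: 4 x 4 gradient accumulation
GRAD_ACCUM_STEPS = 4
PEAK_LR = 2e-4
EPOCHS = 3
NUM_EXAMPLES = 100_000   # the card says "~100,000", so this is approximate

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def cosine_lr(step: int, total: int = total_steps, peak: float = PEAK_LR) -> float:
    """Cosine decay from `peak` to 0 (assumption: no warmup phase)."""
    progress = min(max(step / total, 0.0), 1.0)
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the run takes about 18,750 optimizer steps, and the learning rate is half its peak value at the midpoint of training.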
Training Results
| Metric | Value |
|---|---|
| Intent Classification Accuracy | 94.5% |
| Final Training Loss | 0.287 |
Safety Pipeline Integration
AEGIS-FIN-1 does not operate in isolation. It is embedded within a multi-stage safety pipeline:
Input → Content Classifier → Injection Detector → Topic Enforcer
→ PII Detector → Model Inference → Compliance Validator
→ Predictive Risk Simulator → Output
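The stage ordering above can be sketched as a chain of checks that short-circuits on the first block. Only the stage names and their order come from the diagram; the check logic, the `Verdict` type, and the stubbed `model_inference` are hypothetical placeholders.

```python
import re
from dataclasses import dataclass
from typing import Optional

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative PII rule only


@dataclass
class Verdict:
    allowed: bool
    stage: Optional[str] = None    # name of the stage that blocked, if any
    output: Optional[str] = None   # model output when the query passes


# Each check returns True when processing may continue (placeholder logic).
def content_classifier(text: str) -> bool:
    return "launder" not in text.lower()

def injection_detector(text: str) -> bool:
    return "ignore previous instructions" not in text.lower()

def topic_enforcer(text: str) -> bool:
    return True

def pii_detector(text: str) -> bool:
    return SSN_PATTERN.search(text) is None

def model_inference(text: str) -> str:
    return '{"intent": "unknown"}'  # stand-in for the real model call

def compliance_validator(output: str) -> bool:
    return True

def predictive_risk_simulator(output: str) -> bool:
    return True


PRE_STAGES = [content_classifier, injection_detector, topic_enforcer, pii_detector]
POST_STAGES = [compliance_validator, predictive_risk_simulator]


def run_pipeline(text: str) -> Verdict:
    """Run input checks, then the model, then output checks, stopping at the first block."""
    for check in PRE_STAGES:
        if not check(text):
            return Verdict(False, stage=check.__name__)
    output = model_inference(text)
    for check in POST_STAGES:
        if not check(output):
            return Verdict(False, stage=check.__name__)
    return Verdict(True, output=output)
```

Because the input checks run first, a query blocked by any of them never reaches the model.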
Safety Evaluation (Pipeline-Level)
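Evaluated on a benchmark of 74 adversarial and 20 legitimate financial queries across 13 attack categories. The main test set accounts for 49 adversarial and 15 legitimate queries (see the confusion matrix), which would leave 25 adversarial and 5 legitimate queries for the held-out set.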
| Metric | Main Test Set | Held-Out (Unseen Attacks) |
|---|---|---|
| F1 Score | 0.851 | 0.148 |
| Precision | 0.974 | 1.000 |
| Recall | 0.755 | 0.080 |
| False Positive Rate | 0.067 | 0.000 |
Confusion Matrix (Main Set)
| | Predicted: Block | Predicted: Pass |
|---|---|---|
| Actually Adversarial | TP: 37 | FN: 12 |
| Actually Legitimate | FP: 1 | TN: 14 |
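The main-set metrics follow directly from these four counts, with "block" treated as the positive class. The short check below recomputes them; the helper names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Precision, recall, F1, and false positive rate from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
        "fpr": fpr,
    }


main_set = binary_metrics(tp=37, fn=12, fp=1, tn=14)
# Held-out F1 recomputed from the reported precision (1.000) and recall (0.080).
held_out_f1 = f1_score(1.000, 0.080)
```

Rounded to three decimals, these reproduce the reported values: precision 0.974, recall 0.755, F1 0.851, and false positive rate 0.067 on the main set, and F1 0.148 on the held-out set.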
Ablation Study
Each safety component was disabled independently to measure its contribution:
| Configuration | F1 | ΔF1 |
|---|---|---|
| Full Pipeline | 0.851 | — |
| − Content Classifier | 0.571 | −0.280 |
| − PII Detector | 0.750 | −0.101 |
| − Injection Detector | 0.780 | −0.071 |
| − Topic Enforcer | 0.790 | −0.061 |
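The ΔF1 column is the ablated F1 minus the full-pipeline F1. The snippet below recomputes it from the table and ranks the components by how much F1 drops when each is removed; the variable names are illustrative.

```python
FULL_PIPELINE_F1 = 0.851

# F1 with one component disabled, copied from the ablation table.
ABLATED_F1 = {
    "Content Classifier": 0.571,
    "PII Detector": 0.750,
    "Injection Detector": 0.780,
    "Topic Enforcer": 0.790,
}

# Negative values mean the pipeline got worse without the component.
delta_f1 = {name: round(f1 - FULL_PIPELINE_F1, 3) for name, f1 in ABLATED_F1.items()}

# Components ordered from largest to smallest F1 drop.
ranked = sorted(delta_f1, key=delta_f1.get)
```

By this measure the Content Classifier contributes the most: removing it costs 0.280 F1, more than the other three ablations combined (0.233).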
Latency
Safety pipeline overhead per query (regex mode):
| p50 | p95 | p99 |
|---|---|---|
| 0.10ms | 0.23ms | 3.97ms |
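Percentile latencies like these can be collected by timing each check and summarizing the samples. This is a sketch of one way to do that, not the benchmark harness behind the table; the regex and the helper names are hypothetical stand-ins.

```python
import re
import statistics
import time

# Hypothetical stand-in for a regex-mode safety check.
PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def measure_latency_ms(check, queries, repeats=200):
    """Time `check` on every query `repeats` times; return per-call latency in ms."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            check(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentiles(samples):
    """Return the p50, p95, and p99 of the samples."""
    # quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


samples = measure_latency_ms(PATTERN.search, ["my ssn is 123-45-6789", "lower my phone bill"])
summary = percentiles(samples)
```

A p99 far above the p95, as in the table, usually points to occasional outliers rather than a slow typical path.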
Limitations
Known Limitations
- Held-out recall is low (8%): The regex-based safety components generalize poorly to novel attack patterns not seen during development. ML-based classifiers, which the architecture supports, would be expected to generalize better to unseen attacks.
- US-centric: Training data is calibrated against U.S. Federal Reserve data. Financial recommendations may not apply to other regulatory jurisdictions.
- Synthetic training data only: No real user interactions were used, which may limit edge-case coverage.
- LoRA adapter only: This release provides adapter weights only; inference requires the base Mistral-7B-Instruct-v0.3 model.
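Since only adapter weights are released, inference means loading the base model and attaching the adapter. The sketch below shows one common way to do this with `transformers`, `peft`, and `bitsandbytes`. It assumes the adapter lives at the repository id `aegisheimdall/AEGIS-FIN-1`, that 4-bit NF4 loading is also wanted at inference time, and that `bfloat16` is an acceptable compute dtype. The card confirms none of these for inference.

```python
def load_aegis_fin_1(
    adapter_id: str = "aegisheimdall/AEGIS-FIN-1",           # assumed repository id
    base_id: str = "mistralai/Mistral-7B-Instruct-v0.3",
):
    """Load the 4-bit base model and attach the LoRA adapter (downloads weights)."""
    # Heavy dependencies are imported lazily so the module imports without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Mirrors the quantization described in the card; compute dtype is an assumption.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_aegis_fin_1()` needs a CUDA-capable GPU and network access to fetch both repositories. Per the out-of-scope notes, the loaded model should still sit behind the safety pipeline rather than be queried directly.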
Known Biases
- Income distribution follows U.S. SCF 2022 statistics and may underrepresent very high and very low income brackets
- English-only: No multilingual support
- Financial products are U.S.-market focused (credit scores, BNPL, 401k)
Ethical Considerations
- No real PII: Model was trained entirely on synthetic data; no real user data was used
- PII detection: The safety pipeline actively detects and blocks PII (SSN, credit card numbers) in user inputs before they reach the model
- No execution: Model recommends financial actions but never executes transactions
- AML compliance: BSA/AML evasion attempts are flagged and blocked upstream
- Audit trail: All interactions are logged in a hash-chained audit ledger for compliance and accountability
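A hash-chained ledger makes tampering detectable because each entry's hash covers the previous entry's hash. The card does not describe the ledger format, so the following is a generic sketch: SHA-256, the JSON serialization, the field names, and the all-zero genesis hash are all assumptions.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first entry's predecessor


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the current last entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, record),
    }
    ledger.append(entry)
    return entry


def verify_ledger(ledger: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing any stored record changes its recomputed hash, so `verify_ledger` fails from that entry onward unless every later hash is rewritten as well.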
Citation
@misc{aegis-fin-1-2026,
title={AEGIS-FIN-1: Domain-Specialized Financial AI with Multi-Stage Safety Pipeline},
author={AEGIS Heimdall Team},
year={2026},
}