MAAT: Multi-phase Adapter-Aware Targeted Unlearning

This repository contains a LoRA adapter for Llama-3.2-3B, fine-tuned using the MAAT (Multi-phase Adapter-Aware Targeted Unlearning) framework.

MAAT is a three-phase unlearning framework designed to address the structural skew in machine unlearning evaluation, with a particular focus on "Why-type" questions that probe causal and relational knowledge. The method operates exclusively on LoRA adapter weights and combines four techniques: gradient-projected ascent, SVD rank-dimension pruning, task-vector negation, and hybrid KL-hidden-state retain repair.
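To make "operating exclusively on LoRA adapter weights" concrete, here is a minimal NumPy sketch of task-vector negation carried out entirely in adapter space. It assumes the standard LoRA parameterization, where the dense update is `(alpha / r) * B @ A`. The function names and the `scale` parameter are hypothetical and are not taken from the MAAT code; how MAAT actually builds and applies its task vector is not described in this card.

```python
import numpy as np

def lora_delta(A, B, alpha, r):
    """Dense weight update implied by a LoRA pair: (alpha / r) * B @ A.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return (alpha / r) * (B @ A)

def negate_lora(A, B, scale=1.0):
    """Return a LoRA pair whose update is -scale times the original update.

    Because the update is linear in B, flipping and scaling B alone is enough;
    A is returned unchanged. `scale` is a hypothetical strength knob.
    """
    return A, -scale * B
```

Since only `B` changes, the base model weights are never touched, which is the property the paragraph above describes.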

Model Details

  • Developed by: Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain, Aman Chadha, Amitava Das
  • Model type: LoRA adapter
  • Base model: meta-llama/Llama-3.2-3B
  • Adapter repository: Novaspree/llama-3.2-3B-tofu-adapter
  • Method: Multi-phase Adapter-Aware Targeted Unlearning (MAAT)
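A minimal loading sketch follows, assuming the `transformers` and `peft` libraries are installed and that you have access to the base model weights. The adapter id is the repository name shown on this model page; the helper names `load_maat_model` and `generate` are illustrative only and are not part of any library.

```python
BASE_MODEL_ID = "meta-llama/Llama-3.2-3B"
# Assumption: repository id as displayed on this model page.
ADAPTER_ID = "Novaspree/llama-3.2-3B-tofu-adapter"

def load_maat_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (downloads weights)."""
    # Imported lazily so the heavy libraries are only needed when loading.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer

def generate(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a completion for `prompt` and return it as text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `model, tokenizer = load_maat_model()` followed by `generate(model, tokenizer, "your prompt")`.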

Summary

The MAAT framework establishes a new operating point on the forget-retain Pareto frontier. It achieves high forgetting and high retention on causal knowledge in three phases:

  1. Gradient-Projected Ascent: Using orthogonal projection to remove retain components from the forget gradient.
  2. Structural Compression: Pruning rank dimensions via SVD profiling.
  3. Utility Repair: Applying a multi-objective engine to maintain performance on the retain set.
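The three phases above can be illustrated with a small NumPy sketch. This is not the MAAT implementation; it only shows one plausible form of each building block. Assumptions: gradients are flattened into vectors; the pruning criterion (dropping the smallest singular values) is a placeholder, since this card does not say which rank dimensions MAAT removes; and the retain loss uses KL(reference || current) plus a hidden-state mean squared error with a hypothetical weight `beta`.

```python
import numpy as np

def project_out_retain(g_forget, g_retain, eps=1e-12):
    """Phase 1 sketch: remove from g_forget its component along g_retain."""
    coeff = np.dot(g_forget, g_retain) / (np.dot(g_retain, g_retain) + eps)
    return g_forget - coeff * g_retain

def truncate_lora_rank(A, B, keep):
    """Phase 2 sketch: keep the `keep` largest singular directions of B @ A.

    A has shape (r, d_in), B has shape (d_out, r). Returns a new (A, B)
    pair of rank `keep`. Placeholder criterion, see the note above.
    """
    U, S, Vt = np.linalg.svd(B @ A, full_matrices=False)
    B_new = U[:, :keep] * S[:keep]  # scale each kept column by its singular value
    A_new = Vt[:keep, :]
    return A_new, B_new

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def hybrid_retain_loss(logits, ref_logits, hidden, ref_hidden, beta=1.0):
    """Phase 3 sketch: KL(reference || current) plus beta * hidden-state MSE."""
    logp = log_softmax(logits)
    logp_ref = log_softmax(ref_logits)
    kl = np.sum(np.exp(logp_ref) * (logp_ref - logp), axis=-1).mean()
    mse = np.mean((hidden - ref_hidden) ** 2)
    return kl + beta * mse
```

After projection, the forget update has no first-order component along the retain gradient, which is the intuition behind step 1. The retain loss is zero when the current model matches the reference on both logits and hidden states.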

Citation

@article{yagnik2026maat,
  title={MAAT: Multi-phase Adapter-Aware Targeted Unlearning},
  author={Yagnik, Suryash and Gaur, Shubham and Thakur, Saksham and Jain, Vinija and Chadha, Aman and Das, Amitava},
  journal={arXiv preprint arXiv:2605.30514},
  year={2026}
}