MAAT: Multi-phase Adapter-Aware Targeted Unlearning (Gemma-3-4B Adapter)

This repository contains a LoRA adapter for google/gemma-3-4b-it, fine-tuned with the MAAT (Multi-phase Adapter-Aware Targeted Unlearning) framework. The adapter was developed to address challenges in machine unlearning, particularly for "Why-type" questions, which involve complex causal and relational knowledge.

Model Details

  • Developed by: Suryash Yagnik, Shubham Gaur, Saksham Thakur, Vinija Jain, Aman Chadha, Amitava Das
  • Model Type: LoRA Adapter
  • Base Model: google/gemma-3-4b-it
  • Dataset: Fine-tuned/evaluated on the Factify-5W (5WBENCH) dataset
  • License: Apache 2.0
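
The adapter can be attached to the base model with the `transformers` and `peft` libraries. The snippet below is a minimal sketch, not an official usage recipe: the adapter repository ID `Novaspree/tofu-Gemma3-adapter-1` and the helper name `load_adapter` are assumptions, and it presumes that your installed `transformers` version can load the Gemma 3 checkpoint through `AutoModelForCausalLM` and that you have accepted the base model's access terms.

```python
BASE_MODEL_ID = "google/gemma-3-4b-it"
# Assumption: the Hub repository ID under which this adapter is published.
ADAPTER_ID = "Novaspree/tofu-Gemma3-adapter-1"


def load_adapter(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights; the base weights stay unchanged.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter()` downloads both the base weights and the adapter, so it needs network access and enough memory for a 4B-parameter model.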

Model Description

MAAT is a three-phase unlearning framework that operates exclusively on LoRA adapter weights. It aims to forget specific targeted facts while retaining the model's other knowledge.

The framework consists of:

  1. Phase 1 (Gradient Policy Ascent): Uses orthogonally projected gradients to remove components that conflict with retained knowledge.
  2. Phase 2 (Structural Compression and Task Negation): Employs SVD profiling to prune rank dimensions associated with the forget set.
  3. Phase 3 (Multi-Objective Utility Repair Engine): Runs a hybrid alignment loop to repair the utility of the retained knowledge.
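
The two mechanisms named in Phases 1 and 2 can be illustrated on toy arrays. The NumPy sketch below is not the MAAT implementation: the function names, the use of a single retain-gradient direction, and the scoring rule (singular value times the response to forget-set inputs) are all assumptions chosen to show what "orthogonal projection" and "SVD-based rank pruning" mean in general.

```python
import numpy as np


def project_out(forget_grad, retain_grad, eps=1e-12):
    """Remove from forget_grad its component along retain_grad (hypothetical helper)."""
    coeff = (forget_grad @ retain_grad) / (retain_grad @ retain_grad + eps)
    return forget_grad - coeff * retain_grad


def prune_forget_directions(B, A, forget_inputs, k):
    """Zero the k singular directions of the LoRA update B @ A that respond
    most strongly to forget_inputs (assumed scoring rule, for illustration)."""
    delta_w = B @ A                                   # shape (d_out, d_in)
    U, S, Vt = np.linalg.svd(delta_w, full_matrices=False)
    # Score each direction: singular value times the norm of the
    # forget inputs projected onto the matching right singular vector.
    scores = S * np.linalg.norm(forget_inputs @ Vt.T, axis=0)
    drop = np.argsort(scores)[-k:]                    # indices of the top-k scores
    S_pruned = S.copy()
    S_pruned[drop] = 0.0
    return (U * S_pruned) @ Vt, drop


rng = np.random.default_rng(0)

# Phase 1 idea: the projected forget gradient is orthogonal to the retain gradient.
g_forget = rng.normal(size=8)
g_retain = rng.normal(size=8)
g_proj = project_out(g_forget, g_retain)

# Phase 2 idea: a rank-2 LoRA update loses one rank dimension after pruning.
B = rng.normal(size=(6, 2))         # d_out x r
A = rng.normal(size=(2, 5))         # r x d_in
X_forget = rng.normal(size=(4, 5))  # toy forget-set inputs
pruned_update, dropped = prune_forget_directions(B, A, X_forget, k=1)
```

In a real adapter the gradients would be flattened LoRA parameter gradients and the inputs would be layer activations on the forget set; both are replaced here by random arrays.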

Citation

@article{yagnik2024maat,
  title={MAAT: Multi-phase Adapter-Aware Targeted Unlearning},
  author={Yagnik, Suryash and Gaur, Shubham and Thakur, Saksham and Jain, Vinija and Chadha, Aman and Das, Amitava},
  journal={arXiv preprint arXiv:2605.30514},
  year={2024}
}