IntentRL-Ambig-Text2SQL-4B
This model is trained to handle ambiguous text-to-SQL requests by explicitly reasoning about user intent and producing multiple interpretation–answer pairs rather than silently committing to a single interpretation.
It is based on Qwen/Qwen3-4B-Instruct-2507, fine-tuned with RL (DAPO/GRPO) using a custom reward that encourages recall (covering more valid interpretations) for ambiguous questions and precision for unambiguous ones.
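The recall/precision idea can be illustrated with a minimal sketch. This is not the reward used in the paper or the repository: the function name `interpretation_reward`, the representation of each answer as a hashable value (for example, a canonicalized execution result), and the choice to return plain recall or plain precision are all assumptions made for illustration.

```python
from typing import Hashable, Iterable


def interpretation_reward(
    predicted: Iterable[Hashable],
    gold: Iterable[Hashable],
    ambiguous: bool,
) -> float:
    """Hypothetical reward: recall for ambiguous questions, precision otherwise.

    `predicted` and `gold` hold one hashable item per interpretation,
    e.g. a canonicalized execution result (an assumption of this sketch).
    """
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set or not gold_set:
        return 0.0
    matched = len(pred_set & gold_set)
    recall = matched / len(gold_set)      # share of valid interpretations covered
    precision = matched / len(pred_set)   # share of predictions that are valid
    return recall if ambiguous else precision
```

Under this sketch, an ambiguous question is scored by how many valid interpretations the output covers, while an unambiguous one is penalized for extra, unsupported answers.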
Example
Given a schema and an ambiguous question:
Schema:
CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, Pref_Years INTEGER, Position TEXT, Salary REAL);
Question: Show the required experience for the best-paid role.
The model produces multiple interpretation–answer pairs:
- Minimum years of experience required → SELECT Min_Years ...
- Preferred years of experience → SELECT Pref_Years ...
- Both minimum and preferred years → SELECT Min_Years, Pref_Years ...
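To see why the three interpretations matter, the sketch below runs them against the example schema in an in-memory SQLite database. The sample rows are invented, and the `FROM Jobs ORDER BY Salary DESC LIMIT 1` suffix is only one hypothetical way to complete the elided queries; the model's actual SQL is not shown in this card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, "
    "Pref_Years INTEGER, Position TEXT, Salary REAL)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Jobs VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, 4, "Analyst", 60000.0),
        (2, 5, 8, "Engineer", 120000.0),
    ],
)

# Hypothetical completion of the elided queries ("best-paid" = highest Salary).
SUFFIX = " FROM Jobs ORDER BY Salary DESC LIMIT 1"
interpretations = {
    "minimum": "SELECT Min_Years" + SUFFIX,
    "preferred": "SELECT Pref_Years" + SUFFIX,
    "both": "SELECT Min_Years, Pref_Years" + SUFFIX,
}

# Each interpretation returns a different answer for the same question.
results = {name: conn.execute(sql).fetchall() for name, sql in interpretations.items()}
for name, rows in results.items():
    print(name, rows)
```

Because each reading yields a different result set, committing silently to one of them would hide valid answers from the user.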
Paper
Reasoning about Intent for Ambiguous Requests
Authors: Irina Saparina, Mirella Lapata
Training Details
- Base model: Qwen3-4B-Instruct-2507
- Method: RL with DAPO/GRPO and a custom recall/precision reward
- Training data: Ambrosia text-to-SQL benchmark
- Ambiguous examples are upsampled to balance training
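The upsampling step can be sketched as follows. The card does not state the target ratio or the sampling procedure, so this example assumes a 1:1 balance reached by sampling ambiguous examples with replacement; the `ambiguous` field name and the `upsample_ambiguous` helper are hypothetical.

```python
import random


def upsample_ambiguous(examples, seed=0):
    """Hypothetical helper: duplicate ambiguous examples until both classes match in size.

    Each example is assumed to be a dict with a boolean "ambiguous" field.
    """
    rng = random.Random(seed)
    ambiguous = [ex for ex in examples if ex["ambiguous"]]
    unambiguous = [ex for ex in examples if not ex["ambiguous"]]
    deficit = len(unambiguous) - len(ambiguous)
    # Sample with replacement to fill the gap; do nothing if already balanced.
    extra = rng.choices(ambiguous, k=deficit) if ambiguous and deficit > 0 else []
    balanced = list(examples) + extra
    rng.shuffle(balanced)
    return balanced
```

For instance, a set with one ambiguous and three unambiguous examples would grow to six examples, three of each kind.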
Code
Training and evaluation code: https://github.com/saparina/intentRL
Citation
@misc{saparina2025reasoningintentambiguousrequests,
title={Reasoning about Intent for Ambiguous Requests},
author={Irina Saparina and Mirella Lapata},
year={2025},
eprint={2511.10453},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2511.10453},
}