IntentRL-Ambig-Text2SQL-4B

This model is trained to handle ambiguous text-to-SQL requests by explicitly reasoning about user intent and producing multiple interpretation–answer pairs rather than silently committing to a single interpretation.

It is based on Qwen/Qwen3-4B-Instruct-2507, fine-tuned with RL (DAPO/GRPO) using a custom reward that encourages recall (covering more valid interpretations) for ambiguous questions and precision for unambiguous ones.
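The recall/precision idea can be illustrated with a small sketch. This is not the reward used in the paper. It assumes each predicted and gold query has already been reduced to a hashable stand-in (for example, its execution result), and the name `interpretation_reward` and its signature are hypothetical.

```python
from typing import Hashable, Iterable


def interpretation_reward(
    predicted: Iterable[Hashable],
    gold: Iterable[Hashable],
    ambiguous: bool,
) -> float:
    """Toy reward: recall for ambiguous questions, precision otherwise.

    `predicted` and `gold` hold hashable stand-ins (e.g. execution
    results) for the predicted and reference SQL queries.
    This is an illustrative assumption, not the paper's reward.
    """
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set or not gold_set:
        return 0.0
    matched = len(pred_set & gold_set)
    if ambiguous:
        # Recall: fraction of valid interpretations that were covered.
        return matched / len(gold_set)
    # Precision: fraction of predictions that are correct.
    return matched / len(pred_set)
```

Under this sketch, covering two of three valid interpretations of an ambiguous question scores 2/3, while adding one wrong query next to one correct query for an unambiguous question scores 0.5.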

Example

Given a schema and an ambiguous question:

Schema: CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, Pref_Years INTEGER, Position TEXT, Salary REAL);

Question: Show the required experience for the best-paid role.

The model produces multiple interpretation–answer pairs:

  1. Minimum years of experience required: SELECT Min_Years ...
  2. Preferred years of experience: SELECT Pref_Years ...
  3. Both minimum and preferred years: SELECT Min_Years, Pref_Years ...
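To see how the three interpretations diverge, the sketch below runs one possible query per interpretation against the schema above using Python's built-in `sqlite3` module. The rows are invented for illustration, and the full SQL strings are hypothetical completions of the truncated queries, so the model's actual output may differ.

```python
import sqlite3

SCHEMA = (
    "CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, "
    "Pref_Years INTEGER, Position TEXT, Salary REAL);"
)

# Hypothetical rows, invented purely for illustration.
ROWS = [
    (1, 2, 4, "Analyst", 60000.0),
    (2, 5, 8, "Engineer", 120000.0),
    (3, 3, 5, "Designer", 80000.0),
]

# Hypothetical completions of the three interpretations; the model's
# actual SQL may differ.
INTERPRETATIONS = {
    "minimum": "SELECT Min_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
    "preferred": "SELECT Pref_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
    "both": "SELECT Min_Years, Pref_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
}


def run_interpretations() -> dict:
    """Execute each candidate query against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO Jobs VALUES (?, ?, ?, ?, ?)", ROWS)
        return {
            name: conn.execute(sql).fetchall()
            for name, sql in INTERPRETATIONS.items()
        }
    finally:
        conn.close()


if __name__ == "__main__":
    for name, rows in run_interpretations().items():
        print(name, rows)
```

Each interpretation returns a different result for the same question, which is why committing to only one of them can silently give the user the wrong answer.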

Paper

Reasoning about Intent for Ambiguous Requests

Authors: Irina Saparina, Mirella Lapata

Training Details

  • Base model: Qwen3-4B-Instruct-2507
  • Method: RL with DAPO/GRPO and a custom recall/precision reward
  • Training data: Ambrosia text-to-SQL benchmark
  • Ambiguous examples are upsampled to balance training
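A minimal sketch of upsampling is shown below. The actual sampling strategy and target ratio are not stated here, so the 1:1 balance, the `ambiguous` field name, and the function `upsample_ambiguous` are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def upsample_ambiguous(examples: List[Dict], seed: int = 0) -> List[Dict]:
    """Resample ambiguous examples (with replacement) until they are as
    numerous as the unambiguous ones, then shuffle.

    Assumes each example is a dict with a boolean "ambiguous" key and
    that a 1:1 ratio is the goal; both are illustrative assumptions.
    """
    rng = random.Random(seed)
    ambiguous = [ex for ex in examples if ex["ambiguous"]]
    unambiguous = [ex for ex in examples if not ex["ambiguous"]]
    deficit = len(unambiguous) - len(ambiguous)
    # Only add copies when ambiguous examples are the minority.
    extra = rng.choices(ambiguous, k=deficit) if ambiguous and deficit > 0 else []
    balanced = examples + extra  # new list; the input is left unmodified
    rng.shuffle(balanced)
    return balanced
```

Sampling with replacement keeps the original examples intact and only adds copies, so no unambiguous data is discarded.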

Code

Training and evaluation code: https://github.com/saparina/intentRL

Citation

@misc{saparina2025reasoningintentambiguousrequests,
      title={Reasoning about Intent for Ambiguous Requests},
      author={Irina Saparina and Mirella Lapata},
      year={2025},
      eprint={2511.10453},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2511.10453},
}

Model size: 4B params (Safetensors); tensor type: BF16