TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published 2 days ago • 50
Unlocking On-Policy Distillation for Any Model Family • Running • 89 • Visualize on-policy distillation for any model family
When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning Paper • 2505.15400 • Published May 21, 2025 • 23