ReNIO: Reweighting Negative Trajectory Importance for LLM On-Policy Distillation Paper • 2606.23104 • Published 13 days ago • 5
TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search Paper • 2606.11662 • Published 25 days ago • 10
POISE: Position-Aware Undetectable Skill Injection on LLM Agents Paper • 2606.07943 • Published 29 days ago • 4
Unlocking On-Policy Distillation for Any Model Family • Running • 115 • Explore on-policy distillation visualization for any model
Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models Article • ServiceNow-AI • Nov 19, 2025 • 34
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents Paper • 2510.14967 • Published Oct 16, 2025 • 34