Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 4 days ago • 9
VisionCoach: Reinforcing Grounded Video Reasoning via Visual-Perception Prompting Paper • 2603.14659 • Published 3 days ago • 5
OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism Paper • 2603.14371 • Published 3 days ago • 4
Mind the Shift: Decoding Monetary Policy Stance from FOMC Statements with Large Language Models Paper • 2603.14313 • Published 3 days ago • 3
VoXtream2: Full-stream TTS with dynamic speaking rate control Paper • 2603.13518 • Published 5 days ago • 1
WTF GENIUS PAPERS Collection Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models. • 70 items • Updated 1 day ago • 9
TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning Paper • 2603.12529 • Published 6 days ago • 18 • 3
Training-free Detection of Generated Videos via Spatial-Temporal Likelihoods Paper • 2603.15026 • Published 2 days ago • 8
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning Paper • 2603.15611 • Published 2 days ago • 9
MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos Paper • 2603.14145 • Published 4 days ago • 9
Learning Latent Proxies for Controllable Single-Image Relighting Paper • 2603.15555 • Published 2 days ago • 8
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty Paper • 2603.15500 • Published 2 days ago • 11
Safe and Scalable Web Agent Learning via Recreated Websites Paper • 2603.10505 • Published 7 days ago • 21