The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs
Paper
• 2506.18403
• Published • 3
ReCode: Updating Code API Knowledge with Reinforcement Learning
Paper
• 2506.20495
• Published • 10
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
Paper
• 2507.23348
• Published • 12
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
Paper
• 2509.09614
• Published • 7
LongCodeZip: Compress Long Context for Code Language Models
Paper
• 2510.00446
• Published • 108
CoDA: Coding LM via Diffusion Adaptation
Paper
• 2510.03270
• Published • 43
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
Paper
• 2510.08697
• Published • 39
A Survey of Vibe Coding with Large Language Models
Paper
• 2510.12399
• Published • 50
ReCode: Unify Plan and Action for Universal Granularity Control
Paper
• 2510.23564
• Published • 123
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation
Paper
• 2511.02778
• Published • 103
WebVIA: A Web-based Vision-Language Agentic Framework for Interactive and Verifiable UI-to-Code Generation
Paper
• 2511.06251
• Published • 14
Agentic Rubrics as Contextual Verifiers for SWE Agents
Paper
• 2601.04171
• Published • 13
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences
Paper
• 2601.06789
• Published • 80
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development
Paper
• 2601.11077
• Published • 66
Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
Paper
• 2601.15892
• Published • 53
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
Paper
• 2601.11868
• Published • 34
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
Paper
• 2602.01785
• Published • 96
SWE-Universe: Scale Real-World Verifiable Environments to Millions
Paper
• 2602.02361
• Published • 61
Code2World: A GUI World Model via Renderable Code Generation
Paper
• 2602.09856
• Published • 201
K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model
Paper
• 2602.19128
• Published • 7
Paper
• 2603.01896
• Published • 9
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale
Paper
• 2602.23866
• Published • 88
Qwen3-Coder-Next Technical Report
Paper
• 2603.00729
• Published • 57
Towards a Neural Debugger for Python
Paper
• 2603.09951
• Published • 5
CodePercept: Code-Grounded Visual STEM Perception for MLLMs
Paper
• 2603.10757
• Published • 13
InCoder-32B: Code Foundation Model for Industrial Scenarios
Paper
• 2603.16790
• Published • 286