arxiv:2602.06540

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Published on Feb 6 · Submitted by cwt on Feb 10

Abstract

AgentCPM-Report presents a lightweight local solution for deep research report generation using a Writing As Reasoning Policy framework and multi-stage agentic training to enhance small models' reasoning and outline evolution capabilities.

AI-generated summary

Generating deep research reports requires large-scale information acquisition and the synthesis of insight-driven analysis, posing a significant challenge for current language models. Most existing approaches follow a plan-then-write paradigm, whose performance heavily depends on the quality of the initial outline. However, constructing a comprehensive outline itself demands strong reasoning ability, causing current deep research systems to rely almost exclusively on closed-source or online large models. This reliance raises practical barriers to deployment and introduces safety and privacy concerns for user-authored data. In this work, we present AgentCPM-Report, a lightweight yet high-performing local solution composed of a framework that mirrors the human writing process and an 8B-parameter deep research agent. Our framework uses a Writing As Reasoning Policy (WARP), which enables models to dynamically revise outlines during report generation. Under this policy, the agent alternates between Evidence-Based Drafting and Reasoning-Driven Deepening, jointly supporting information acquisition, knowledge refinement, and iterative outline evolution. To effectively equip small models with this capability, we introduce a Multi-Stage Agentic Training strategy, consisting of cold-start, atomic skill RL, and holistic pipeline RL. Experiments on DeepResearch Bench, DeepConsult, and DeepResearch Gym demonstrate that AgentCPM-Report outperforms leading closed-source systems, with substantial gains in Insight.
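The alternation between drafting and deepening described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Report`, `write_report`, `retrieve`, `draft`, and `deepen` are hypothetical, the callables stand in for model and search calls, and outline evolution is reduced here to appending new headings, which is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Report:
    """Hypothetical container: an evolving outline plus drafted section text."""
    outline: List[str]
    sections: Dict[str, str] = field(default_factory=dict)


def draft_missing(report, query, retrieve, draft):
    # Drafting phase (assumed form): write each outline section that has
    # no text yet, grounded in evidence retrieved for that heading.
    for heading in report.outline:
        if heading not in report.sections:
            evidence = retrieve(f"{query} {heading}")
            report.sections[heading] = draft(heading, evidence)


def write_report(
    query: str,
    initial_outline: List[str],
    retrieve: Callable[[str], List[str]],
    draft: Callable[[str, List[str]], str],
    deepen: Callable[[Report], List[str]],
    max_rounds: int = 5,
) -> Report:
    report = Report(outline=list(initial_outline))
    for _ in range(max_rounds):
        draft_missing(report, query, retrieve, draft)
        # Deepening phase (assumed form): reason over the current draft
        # and propose headings that the outline is still missing.
        new_headings = [h for h in deepen(report) if h not in report.outline]
        if not new_headings:
            break  # the outline has stopped evolving
        report.outline.extend(new_headings)
    # Draft anything added in the final round so no heading is left empty.
    draft_missing(report, query, retrieve, draft)
    return report
```

The point of the sketch is only the control flow: the outline is not fixed up front, as in a plan-then-write pipeline, but is revised between drafting passes until the deepening step proposes nothing new or the round budget runs out.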

Community


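AgentCPM-Report is an open-source large language model agent jointly developed by THUNLP, RUCBM at Renmin University of China, and ModelBest. Built on the 8-billion-parameter MiniCPM4.1 base model, it takes user instructions as input and autonomously generates long-form reports. Its highlights are:

  • Extreme efficiency, punching above its weight: with an average of 40 rounds of deep retrieval and nearly 100 rounds of chain-of-thought reasoning, it thoroughly mines and reorganizes information, so that an on-device model can produce logically rigorous, insightful reports on the order of ten thousand characters. On deep research tasks it matches top closed-source systems at an 8B parameter scale.
  • Physical isolation, local security: designed for high-privacy scenarios, it supports fully offline, agile local deployment, eliminating the risk of leaks through the cloud. Built on our UltraRAG framework, it can efficiently mount and understand your local private knowledge base, so that core confidential data is safely turned into valuable professional decision-making reports without ever leaving your domain.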

GitHub: https://github.com/OpenBMB/AgentCPM
Hugging Face: https://huggingface.co/openbmb/AgentCPM-Report
ModelScope: https://modelscope.cn/models/OpenBMB/AgentCPM-Report
