arxiv:2602.06540

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Published on Feb 6

Submitted by cwt on Feb 10

OpenBMB

Authors:

Wentong Chen, Mingwei Li

Abstract

AgentCPM-Report presents a lightweight local solution for deep research report generation using a Writing As Reasoning Policy framework and multi-stage agentic training to enhance small models' reasoning and outline evolution capabilities.

AI-generated summary

Generating deep research reports requires large-scale information acquisition and the synthesis of insight-driven analysis, posing a significant challenge for current language models. Most existing approaches follow a plan-then-write paradigm, whose performance heavily depends on the quality of the initial outline. However, constructing a comprehensive outline itself demands strong reasoning ability, causing current deep research systems to rely almost exclusively on closed-source or online large models. This reliance raises practical barriers to deployment and introduces safety and privacy concerns for user-authored data. In this work, we present AgentCPM-Report, a lightweight yet high-performing local solution composed of a framework that mirrors the human writing process and an 8B-parameter deep research agent. Our framework uses a Writing As Reasoning Policy (WARP), which enables models to dynamically revise outlines during report generation. Under this policy, the agent alternates between Evidence-Based Drafting and Reasoning-Driven Deepening, jointly supporting information acquisition, knowledge refinement, and iterative outline evolution. To effectively equip small models with this capability, we introduce a Multi-Stage Agentic Training strategy, consisting of cold-start, atomic skill RL, and holistic pipeline RL. Experiments on DeepResearch Bench, DeepConsult, and DeepResearch Gym demonstrate that AgentCPM-Report outperforms leading closed-source systems, with substantial gains in Insight.
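
The alternation between Evidence-Based Drafting and Reasoning-Driven Deepening described in the abstract can be pictured as a simple control loop. The sketch below is only an illustration of that idea, not the authors' implementation: `Report`, `warp_loop`, `search`, `draft`, `deepen`, and `max_rounds` are hypothetical names, and the toy callables stand in for what the paper describes as model-driven retrieval, writing, and outline revision.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Report:
    """Outline plus the text drafted so far for each outline entry."""
    outline: List[str]
    sections: Dict[str, str] = field(default_factory=dict)


def warp_loop(
    query: str,
    search: Callable[[str], List[str]],
    draft: Callable[[str, List[str]], str],
    deepen: Callable[[Report], List[str]],
    max_rounds: int = 5,
) -> Report:
    """Alternate drafting and deepening until the outline stops growing.

    All callables are assumptions standing in for model calls.
    """
    report = Report(outline=[query])

    def draft_pending() -> None:
        # Evidence-based drafting: retrieve evidence for every outline
        # entry that has no text yet, then write it.
        for title in report.outline:
            if title not in report.sections:
                report.sections[title] = draft(title, search(title))

    for _ in range(max_rounds):
        draft_pending()
        # Reasoning-driven deepening: review the current draft and
        # propose outline entries that are not present yet.
        new_titles = [t for t in deepen(report) if t not in report.outline]
        if not new_titles:
            break
        report.outline.extend(new_titles)

    # Entries added in the last allowed round still need text.
    draft_pending()
    return report


# Toy stand-ins so the sketch runs without any model or search backend.
def toy_search(title: str) -> List[str]:
    return [f"evidence about {title}"]


def toy_draft(title: str, evidence: List[str]) -> str:
    return f"{title}: " + "; ".join(evidence)


def toy_deepen(report: Report) -> List[str]:
    # Propose one follow-up section while only the root topic exists.
    root = report.outline[0]
    return [f"{root} / risks"] if len(report.outline) == 1 else []


if __name__ == "__main__":
    result = warp_loop("solid-state batteries", toy_search, toy_draft, toy_deepen)
    for title in result.outline:
        print(result.sections[title])
```

The point of the sketch is the contrast with plan-then-write: the outline is not fixed up front but grows as a side effect of writing, and the loop ends when deepening proposes nothing new or the round budget runs out.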

Community

yiye2023

Paper author Paper submitter 1 day ago

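AgentCPM-Report is an open-source large language model agent jointly developed by THUNLP, RUCBM at Renmin University of China, and ModelBest. Built on the 8B-parameter MiniCPM4.1 base model, it takes a user instruction as input and autonomously generates a long-form report. Its highlights are: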

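Extreme efficiency, a small model with big results: with an average of 40 rounds of deep retrieval and nearly 100 rounds of chain-of-thought reasoning, it mines and reorganizes information comprehensively, so that an on-device model can produce logically rigorous, insightful long-form reports of around ten thousand characters. On deep research tasks it matches the performance of top closed-source systems at only 8B parameters.
Physical isolation, local security: designed for high-privacy scenarios, it supports fully offline, agile local deployment, eliminating the risk of cloud-side data leaks. Built on our UltraRAG framework, it can efficiently mount and understand your local private knowledge base, so that core confidential data is safely turned into valuable professional decision reports without ever leaving your own environment.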

GitHub: https://github.com/OpenBMB/AgentCPM
Hugging Face: https://huggingface.co/openbmb/AgentCPM-Report
ModelScope: https://modelscope.cn/models/OpenBMB/AgentCPM-Report