arxiv:2602.02084

Closing the Loop: Universal Repository Representation with RPG-Encoder

Published on Feb 2
· Submitted by
Jane Luo
on Feb 3

Abstract

The RPG-Encoder framework turns repository comprehension and generation into a unified cycle by encoding code into high-fidelity Repository Planning Graph (RPG) representations, improving both understanding and reconstruction accuracy.

AI-generated summary

Current repository agents encounter a reasoning disconnect due to fragmented representations, as existing methods rely on isolated API documentation or dependency graphs that lack semantic depth. We consider repository comprehension and generation to be inverse processes within a unified cycle: generation expands intent into implementation, while comprehension compresses implementation back into intent. To address this, we propose RPG-Encoder, a framework that generalizes the Repository Planning Graph (RPG) from a static generative blueprint into a unified, high-fidelity representation. RPG-Encoder closes the reasoning loop through three mechanisms: (1) Encoding raw code into the RPG that combines lifted semantic features with code dependencies; (2) Evolving the topology incrementally to decouple maintenance costs from repository scale, reducing overhead by 95.7%; and (3) Operating as a unified interface for structure-aware navigation. In evaluations, RPG-Encoder establishes state-of-the-art repository understanding on SWE-bench Verified with 93.7% Acc@5 and exceeds the best baseline by over 10% on SWE-bench Live Lite. These results highlight our superior fine-grained localization accuracy in complex codebases. Furthermore, it achieves 98.5% reconstruction coverage on RepoCraft, confirming RPG's high-fidelity capacity to mirror the original codebase and closing the loop between intent and implementation.
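As a rough illustration of the three mechanisms named in the abstract, the sketch below models a toy graph whose nodes pair a short semantic description with dependency edges, supports a per-file incremental update, and exposes two simple navigation queries. It is not the paper's implementation: the `Node` and `RepoGraph` classes, their fields, and the keyword search are all hypothetical names and simplifications chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One code unit. Hypothetical schema, not the paper's RPG format."""
    path: str                                    # file the unit lives in
    feature: str                                 # short description of intent
    deps: set[str] = field(default_factory=set)  # units this one depends on


class RepoGraph:
    """Toy graph combining semantic features with dependency edges."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def upsert(self, name: str, path: str, feature: str, deps=()) -> None:
        # (1) Encoding: store the description and the dependencies together.
        self.nodes[name] = Node(path, feature, set(deps))

    def update_file(self, path: str, units: dict) -> int:
        # (2) Incremental evolution: rebuild only the nodes of one changed
        # file, so the work depends on the change, not the repository size.
        stale = [n for n, node in self.nodes.items() if node.path == path]
        for n in stale:
            del self.nodes[n]
        for name, (feature, deps) in units.items():
            self.upsert(name, path, feature, deps)
        return len(stale) + len(units)  # number of nodes touched

    def search(self, keyword: str) -> list[str]:
        # (3) Navigation by intent: naive keyword match on descriptions.
        kw = keyword.lower()
        return sorted(n for n, node in self.nodes.items()
                      if kw in node.feature.lower())

    def dependents(self, name: str) -> list[str]:
        # (3) Navigation by structure: units that depend on `name`.
        return sorted(n for n, node in self.nodes.items() if name in node.deps)


# Usage with made-up units.
graph = RepoGraph()
graph.upsert("load_config", "config.py", "read settings from a YAML file")
graph.upsert("connect_db", "db.py", "open a database connection",
             deps=["load_config"])
graph.upsert("run_server", "server.py", "start the HTTP server",
             deps=["load_config", "connect_db"])

# Only db.py changed, so only its single node is replaced.
touched = graph.update_file(
    "db.py",
    {"connect_db": ("open a pooled database connection", ["load_config"])},
)
```

In this toy version the update touches two node records (one removed, one added) regardless of how many other files the graph holds, which is the intuition behind decoupling maintenance cost from repository scale. A real system would presumably derive the descriptions and edges from the code rather than take them as arguments.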
