arxiv:2505.22571

Agent-UniRAG: A Trainable Open-Source LLM Agent Framework for Unified Retrieval-Augmented Generation Systems

Published on May 28, 2025

Authors:

Abstract

A unified retrieval-augmented generation framework is proposed that enables single-hop and multi-hop query processing through an LLM agent architecture, demonstrating performance comparable to larger models on benchmark tasks.

AI-generated summary

This paper presents a novel approach to unified retrieval-augmented generation (RAG) systems based on the recently emerging concept of large language model (LLM) agents. Specifically, LLM agents, which use an LLM as the fundamental controller, have become a promising approach to enabling interpretability in RAG tasks, especially for complex reasoning question-answering systems (e.g., multi-hop queries). Nonetheless, previous works mainly address RAG with either single-hop or multi-hop approaches separately, which limits their use in real-world applications. In this study, we propose a trainable agent framework called Agent-UniRAG for unified retrieval-augmented LLM systems, which enhances the effectiveness and interpretability of RAG systems. The main idea is to design an LLM agent framework that solves RAG tasks step by step based on the complexity of the input, handling both single-hop and multi-hop queries in an end-to-end manner. Furthermore, we introduce SynAgent-RAG, a synthetic dataset that enables the proposed agent framework to be used with small open-source LLMs (e.g., Llama-3-8B). The results show performance comparable to closed-source and larger open-source LLMs across various RAG benchmarks. Our source code and dataset are publicly available for further exploitation.
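The step-by-step idea described in the abstract can be illustrated with a toy agent loop. This is a minimal sketch under stated assumptions, not the Agent-UniRAG implementation: the corpus, the word-overlap `retrieve` function, the `Step` type, `run_agent`, and the scripted `stub_planner` (which stands in for a fine-tuned LLM) are all hypothetical names invented here. It only shows how one loop can serve a single-hop query with one search and a multi-hop query with several.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy corpus standing in for a real document index (illustrative data only).
CORPUS = [
    "Alice wrote the book Tides.",
    "Tides was published in Lisbon.",
    "Lisbon is the capital of Portugal.",
]


def _words(text: str) -> set:
    """Lowercase a string and split it into a set of words, dropping '.' and '?'."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(query: str, k: int = 1) -> List[str]:
    """Hypothetical retriever: rank passages by word overlap with the query."""
    scored = [(len(_words(query) & _words(p)), p) for p in CORPUS]
    scored.sort(key=lambda pair: -pair[0])  # stable sort, highest overlap first
    return [p for score, p in scored[:k] if score > 0]


@dataclass
class Step:
    action: str  # "search" or "answer"
    text: str    # the search query, or the final answer


# A planner maps (question, evidence gathered so far) to the next step.
Planner = Callable[[str, List[str]], Step]


def run_agent(question: str, planner: Planner, max_steps: int = 4) -> Tuple[str, List[Step]]:
    """Loop until the planner answers; the number of searches depends on the query."""
    evidence: List[str] = []
    trace: List[Step] = []
    for _ in range(max_steps):
        step = planner(question, evidence)
        trace.append(step)
        if step.action == "answer":
            return step.text, trace
        evidence.extend(p for p in retrieve(step.text) if p not in evidence)
    return "unknown", trace  # step budget exhausted without an answer


# Assumption: scripted plans replace the LLM so the sketch is deterministic.
SCRIPTS = {
    "Who wrote Tides?": [
        Step("search", "Who wrote Tides?"),
        Step("answer", "Alice"),
    ],
    "In which city was the book written by Alice published?": [
        Step("search", "book written by Alice"),
        Step("search", "Tides published"),
        Step("answer", "Lisbon"),
    ],
}


def stub_planner(question: str, evidence: List[str]) -> Step:
    """Pick the next scripted step based on how much evidence has been gathered."""
    return SCRIPTS[question][len(evidence)]


single_answer, single_trace = run_agent("Who wrote Tides?", stub_planner)
multi_answer, multi_trace = run_agent(
    "In which city was the book written by Alice published?", stub_planner
)
print(single_answer, len(single_trace))
print(multi_answer, len(multi_trace))
```

In this sketch the single-hop question ends after one search and one answer step, while the multi-hop question needs two searches before answering. The recorded trace is what makes each decision inspectable. In a real system the planner would be a trained model rather than a lookup table.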


Community

Austin8547

3 days ago

I am an MSc Data Science student at the University of Kerala, currently working on my final year project and research.

I recently read your paper, "Agent-UniRAG: A Trainable Open-Source LLM Agent Framework for Unified Retrieval-Augmented Generation Systems", and I am very interested in your approach to unifying single-hop and multi-hop RAG tasks using LLM agents. Your work on the SynAgent-RAG dataset for fine-tuning smaller open-source models like Llama-3-8B is particularly relevant to my current studies.

In the paper, you mentioned that the source code and dataset are publicly available for further exploitation. Could you please provide the link to the official repository or instructions on how I might access these resources for my academic research?

Thank you for your time and for sharing your work with the research community.

Best regards,

