RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping
| Paper | 🤗 Model | 🤗 Dataset | 🖥️ Website |

Dongming Wu, Yanping Fu, Saike Huang, Yingfei Liu, Fan Jia, Nian Liu, Feng Dai, Tiancai Wang, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jianbing Shen
TL;DR
- To push forward general robotic grasping, we introduce a large-scale reasoning-based affordance segmentation benchmark, RAGNet. It contains 273k images, 180 categories, and 26k reasoning instructions.
- Furthermore, we propose a comprehensive affordance-based grasping framework, named AffordanceNet, which consists of a VLM (named AffordanceVLM) pre-trained on our massive affordance data and a grasping network that is conditioned on the affordance map to grasp the target.
News
- [2025.08] The paper is released on arXiv.
- [2025.07] The inference code and the AffordanceVLM model are released. Feel free to try them!
- [2025.06] The paper is accepted to ICCV 2025!
Getting Started
To deploy using Gradio, run the following command:
python app.py --version='./exps/AffordanceVLM-7B'
Main Results
🔹 Affordance Segmentation
| Method | HANDAL gIoU | HANDAL cIoU | HANDAL† gIoU | HANDAL† cIoU | GraspNet seen gIoU | GraspNet seen cIoU | GraspNet novel gIoU | GraspNet novel cIoU | 3DOI gIoU | 3DOI cIoU |
|---|---|---|---|---|---|---|---|---|---|---|
| AffordanceNet | 60.3 | 60.8 | 60.5 | 60.3 | 63.3 | 64.0 | 45.6 | 33.2 | 37.4 | 37.4 |
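
The gIoU and cIoU columns are not defined in this README. The following is a minimal sketch, assuming they follow the convention common in reasoning-segmentation work such as LISA: gIoU is the mean of per-image IoUs, and cIoU is the cumulative intersection over the cumulative union across all images. The function names, the list-of-lists mask format, and the handling of empty masks are illustrative assumptions, not taken from the RAGNet evaluation code.

```python
from typing import Sequence, Tuple

# Assumption: a mask is a 2D grid of 0/1 values (hypothetical format for illustration).
Mask = Sequence[Sequence[int]]


def intersection_and_union(pred: Mask, gt: Mask) -> Tuple[int, int]:
    """Count intersecting and union pixels between two binary masks."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    return inter, union


def giou_ciou(preds: Sequence[Mask], gts: Sequence[Mask]) -> Tuple[float, float]:
    """Return (gIoU, cIoU) over a set of predicted and ground-truth masks."""
    per_image_ious = []
    total_inter = 0
    total_union = 0
    for pred, gt in zip(preds, gts):
        inter, union = intersection_and_union(pred, gt)
        # Assumption: an empty prediction against an empty target counts as IoU = 1.
        per_image_ious.append(inter / union if union > 0 else 1.0)
        total_inter += inter
        total_union += union
    giou = sum(per_image_ious) / len(per_image_ious)
    ciou = total_inter / total_union if total_union > 0 else 1.0
    return giou, ciou


# Two toy 2x2 images: the first prediction over-segments, the second is exact.
example_preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
example_gts = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]
example_giou, example_ciou = giou_ciou(example_preds, example_gts)
```

Under these definitions, cIoU weights large objects more heavily because it pools pixels across images, while gIoU treats every image equally. That difference is one possible reason the two columns can diverge on a given split.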
🔸 Reasoning-Based Affordance Segmentation
| Method | HANDAL (easy) gIoU | HANDAL (easy) cIoU | HANDAL (hard) gIoU | HANDAL (hard) cIoU | 3DOI gIoU | 3DOI cIoU |
|---|---|---|---|---|---|---|
| AffordanceNet | 58.3 | 58.1 | 58.2 | 57.8 | 38.1 | 39.4 |
Citation
If you find our work useful, please consider citing:
@inproceedings{wu2025ragnet,
title={RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping},
author={Wu, Dongming and Fu, Yanping and Huang, Saike and Liu, Yingfei and Jia, Fan and Liu, Nian and Dai, Feng and Wang, Tiancai and Anwer, Rao Muhammad and Khan, Fahad Shahbaz and others},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={11980--11990},
year={2025}
}
Acknowledgements
We thank the authors who open-sourced the following projects.