arxiv:2606.24112

ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection

Published on Jun 23

Submitted by Chenhao Dang on Jun 24

OpenDataLab

Abstract

A comprehensive multimodal misinformation detection framework is introduced that handles complex, multilingual content with multiple images and diverse verification approaches, achieving superior performance while reducing computational costs.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image framing errors. Existing benchmarks and methods remain poorly matched to this setting: they usually isolate short captions, single images, binary labels, or one manipulation source, while agentic verification remains costly under realistic evidence search. We present ReMMD, a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection. ReMMD includes ReMMDBench, a real-world multimodal misinformation detection benchmark with 500 samples, 2,756 images, five monolingual languages, two cross-lingual settings, three text-length tiers, multi-image posts, five-way veracity labels, eight distortion labels, evidence provenance, and rationales. It also includes ReMMD-Agent, a persistent-memory verifier that decomposes posts into atomic points, builds a reusable evidence set, and predicts structured L1/L2/L3 outputs. Across proprietary systems, open LVLMs, MMD-Agent, and T2-Agent, ReMMD-Agent obtains the best five-way veracity performance, with 41.80% accuracy and 39.12% macro-F1 using GPT-5.2, while reducing cost by 17.5% relative to MMD-Agent and 79.9% relative to T2-Agent. The project is available at https://dang-ai.github.io/ReMMD.
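The abstract reports five-way veracity results as accuracy and macro-F1. As a reminder of what those two numbers measure, here is a minimal sketch in plain Python. The toy labels are made up for illustration, the integer class ids 0 to 4 stand in for the five veracity labels (whose names the abstract does not give), and the helper names `accuracy` and `macro_f1` are assumptions, not part of the ReMMD code.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of samples whose predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1; a class with no support and no
    predictions contributes 0.0."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy data only: ids 0..4 are placeholders for the five veracity classes.
LABELS = [0, 1, 2, 3, 4]
gold = [0, 0, 1, 2, 3, 4, 4, 1]
pred = [0, 1, 1, 2, 4, 4, 4, 0]

acc = accuracy(gold, pred)
mf1 = macro_f1(gold, pred, LABELS)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

Because macro-F1 averages per-class scores without weighting by class frequency, a class that is never predicted correctly (class 3 in the toy data) pulls the average down more than it pulls accuracy down.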

Community

DDAI-D

Paper submitter about 14 hours ago

We introduce ReMMD, a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection. ReMMDBench contains 500 real-world samples with 2,756 images, five monolingual languages, two cross-lingual settings, three text-length tiers, five-way veracity labels, eight distortion labels, evidence provenance, and rationales, targeting the gap between existing simplified MMD benchmarks and real-world fact-checking scenarios. We also propose ReMMD-Agent, a persistent-memory verifier that decomposes posts into atomic claims and image bindings, reuses retrieved evidence, and produces structured veracity, distortion, and rationale outputs. Experiments across commercial agents and open-source MMD agents show that ReMMD-Agent achieves the strongest overall performance while substantially reducing verification cost. Project page: https://dang-ai.github.io/ReMMD
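The idea of decomposing a post into atomic claims and reusing retrieved evidence can be illustrated with a small cache. The sketch below is not the authors' implementation: `EvidenceMemory`, `split_into_claims`, and `fake_search` are hypothetical names, sentence splitting is a naive stand-in for real atomic decomposition, and the retriever is a stub. It only shows why a persistent evidence store can lower the number of retrieval calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EvidenceMemory:
    """Caches retrieved evidence so repeated claims reuse earlier results."""
    retrieve: Callable[[str], List[str]]  # hypothetical search backend
    store: Dict[str, List[str]] = field(default_factory=dict)
    calls: int = 0  # count of real retrieval calls, a rough proxy for cost

    def lookup(self, claim: str) -> List[str]:
        # Normalise case and whitespace so trivially different claims match.
        key = " ".join(claim.lower().split())
        if key not in self.store:
            self.calls += 1
            self.store[key] = self.retrieve(claim)
        return self.store[key]


def split_into_claims(post: str) -> List[str]:
    """Naive stand-in for atomic decomposition: one claim per sentence."""
    return [s.strip() for s in post.split(".") if s.strip()]


def fake_search(query: str) -> List[str]:
    """Stub retriever; a real system would query a search or fact-check API."""
    return [f"evidence for: {query}"]


memory = EvidenceMemory(retrieve=fake_search)
posts = [
    "The bridge collapsed on Monday. The photo shows the bridge.",
    "the bridge collapsed on monday. Officials denied the report.",
]
total_claims = 0
for post in posts:
    for claim in split_into_claims(post):
        total_claims += 1
        memory.lookup(claim)

print(f"claims={total_claims} retrieval_calls={memory.calls}")
```

In this toy run the two posts yield four claims but only three retrieval calls, because the first claim of the second post matches a cached entry after normalisation. A real verifier would need a much more robust matching step than lowercasing.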
