MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding
Paper: arXiv:2505.20298
This repository contains the MangaLMM model described in the paper *MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding*.
Code: https://github.com/manga109/MangaLMM
Official demo: https://huggingface.co/spaces/yuki-imajuku/MangaLMM-Demo