Papers
arxiv:2602.11144

GENIUS: Generative Fluid Intelligence Evaluation Suite

Published on Feb 11
· Submitted by
li haodong
on Feb 12
#2 Paper of the day
Authors:
,
,
,
,
,
,
,
,
,
,

Abstract

GENIUS evaluates multimodal models' generative fluid intelligence through pattern induction, constraint execution, and contextual adaptation tasks, revealing deficiencies in context comprehension rather than generative capability.

AI-generated summary

Unified Multimodal Models (UMMs) have shown remarkable progress in visual generation. Yet, existing benchmarks predominantly assess Crystallized Intelligence, which relies on recalling accumulated knowledge and learned schemas. This focus overlooks Generative Fluid Intelligence (GFI): the capacity to induce patterns, reason through constraints, and adapt to novel scenarios on the fly. To rigorously assess this capability, we introduce GENIUS (GEN Fluid Intelligence EvalUation Suite). We formalize GFI as a synthesis of three primitives. These include Inducing Implicit Patterns (e.g., inferring personalized visual preferences), Executing Ad-hoc Constraints (e.g., visualizing abstract metaphors), and Adapting to Contextual Knowledge (e.g., simulating counter-intuitive physics). Collectively, these primitives challenge models to solve problems grounded entirely in the immediate context. Our systematic evaluation of 12 representative models reveals significant performance deficits in these tasks. Crucially, our diagnostic analysis disentangles these failure modes. It demonstrates that deficits stem from limited context comprehension rather than insufficient intrinsic generative capability. To bridge this gap, we propose a training-free attention intervention strategy. Ultimately, GENIUS establishes a rigorous standard for GFI, guiding the field beyond knowledge utilization toward dynamic, general-purpose reasoning. Our dataset and code will be released at: https://github.com/arctanxarc/GENIUS{https://github.com/arctanxarc/GENIUS}.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2602.11144 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2602.11144 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2602.11144 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.