AIML-TUDA's Collections
Reward Hacking in Reasoning Models
Scalable Logical Reasoning
How to Train your Text‑to‑Image Model
LlavaGuard

Reward Hacking in Reasoning Models

updated 4 days ago

Do reasoning LLMs actually reason, or do they learn to game the test? Isomorphic Perturbation Testing (IPT) detects reward hacking in inductive programming tasks (SLR-Bench).

  • Isomorphic Perturbation Testing

    Space • Running • Evaluate rule hypotheses for genuine reasoning vs. shortcuts


  • AIML-TUDA/SLR-Bench

    Viewer • Updated about 1 hour ago • 38.5k • 1.42k • 4

  • SLR-Bench Leaderboard - Reward Hacking in Reasoning Models

    Space • Running • Reward shortcut behavior in LLMs via IPT


  • LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking

    Paper • 2604.15149 • Published Apr 16 • 1