Joakim Lee

Reinforcement4All

upvoted 20 papers 3 months ago

Three-Phase Transformer

Paper • 2604.14430 • Published Apr 15 • 4

Model Capability Dominates: Inference-Time Optimization Lessons from AIMO 3

Paper • 2603.27844 • Published Apr 16 • 3

Towards Autonomous Mechanistic Reasoning in Virtual Cells

Paper • 2604.11661 • Published Apr 14 • 6

SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems

Paper • 2604.04514 • Published Apr 6 • 7

Reinforcement Learning via Value Gradient Flow

Paper • 2604.14265 • Published Apr 15 • 7

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

Paper • 2604.15309 • Published Apr 16 • 8

RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography

Paper • 2604.15231 • Published Apr 16 • 6

LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning

Paper • 2604.14922 • Published Apr 16 • 7

OneHOI: Unifying Human-Object Interaction Generation and Editing

Paper • 2604.14062 • Published Apr 15 • 8

Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

Paper • 2604.14228 • Published Apr 14 • 25

DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Paper • 2604.14683 • Published Apr 16 • 36

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

Paper • 2604.15308 • Published Apr 16 • 29

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 127

InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis

Paper • 2604.13201 • Published Apr 14 • 2

MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments

Paper • 2604.13418 • Published Apr 15 • 6

SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering

Paper • 2604.11548 • Published Apr 13 • 22

Target Policy Optimization

Paper • 2604.06159 • Published Apr 7 • 23

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published Apr 15 • 30

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Paper • 2604.10866 • Published Apr 13 • 69

SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

Paper • 2604.14144 • Published Apr 15 • 63