Dipan Maity

DipanM2

ryyzn9

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

upvoted a paper 2 days ago

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

upvoted a paper 2 days ago

MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE

Organizations

None yet

upvoted 20 papers 2 days ago

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 6

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Paper • 2601.00423 • Published Jan 1, 2026 • 11

MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE

Paper • 2507.21802 • Published Jul 29, 2025 • 19

TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization

Paper • 2601.16480 • Published 28 days ago • 52

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

GRPO++: Enhancing Dermatological Reasoning under Low Resource Settings

Paper • 2510.01236 • Published Sep 23, 2025 • 1

GRPO is Secretly a Process Reward Model

Paper • 2509.21154 • Published Sep 25, 2025 • 1

Self-Generated Critiques Boost Reward Modeling for Language Models

Paper • 2411.16646 • Published Nov 25, 2024 • 1

Reward Modeling from Natural Language Human Feedback

Paper • 2601.07349 • Published Jan 12, 2026 • 2

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 11

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5, 2025 • 81

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

Paper • 2511.20102 • Published Nov 25, 2025 • 28

O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents

Paper • 2511.13593 • Published Nov 17, 2025 • 27

CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning

Paper • 2511.18659 • Published Nov 24, 2025 • 24

Pillar-0: A New Frontier for Radiology Foundation Models

Paper • 2511.17803 • Published Nov 21, 2025 • 24

NVIDIA Nemotron Parse 1.1

Paper • 2511.20478 • Published Nov 25, 2025 • 23

Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO

Paper • 2511.13288 • Published Nov 17, 2025 • 19

Monet: Reasoning in Latent Visual Space Beyond Images and Language

Paper • 2511.21395 • Published Nov 26, 2025 • 18

Fara-7B: An Efficient Agentic Model for Computer Use

Paper • 2511.19663 • Published Nov 24, 2025 • 15

Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs

Paper • 2511.19773 • Published Nov 24, 2025 • 10