Koi

KOIIIII

AI & ML interests

None yet

Recent Activity

upvoted a paper 25 days ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

upvoted a paper 25 days ago

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

liked a dataset about 1 month ago

datatune/LogiHard-2K

Organizations

None yet

upvoted 2 papers 25 days ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published 26 days ago • 219

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published 26 days ago • 101

upvoted a paper about 2 months ago

Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator

Paper • 2604.08121 • Published Apr 9 • 43

upvoted a paper 2 months ago

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published Apr 2 • 55

upvoted a paper 6 months ago

PIPPA: A Partially Synthetic Conversational Dataset

Paper • 2308.05884 • Published Aug 11, 2023 • 34

upvoted a collection 7 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 109

upvoted an article 11 months ago

Mixture of Experts Explained

Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.14k

upvoted an article about 1 year ago

The N Implementation Details of RLHF with PPO

Article • vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72

upvoted 3 papers about 1 year ago

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 145

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 60
