💼 Hiring

Kai Hua

kkish

https://kifish.github.io

AI & ML interests

None yet

Recent Activity

updated a collection 12 days ago

upvoted a paper 27 days ago

AIR: Post-training Data Selection for Reasoning via Attention Head Influence

Organizations

updated a collection 12 days ago

Model Released

3 items • Updated 8 days ago

upvoted 2 papers 27 days ago

AIR: Post-training Data Selection for Reasoning via Attention Head Influence

Paper • 2512.13279 • Published Dec 15, 2025 • 2

LLMs are Also Effective Embedding Models: An In-depth Overview

Paper • 2412.12591 • Published Dec 17, 2024 • 2

upvoted 2 papers 29 days ago

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

Paper • 2606.13473 • Published Jun 11 • 93

MiniMax Sparse Attention

Paper • 2606.13392 • Published Jun 11 • 151

updated a collection 30 days ago

Model Released

3 items • Updated 8 days ago

liked a model 30 days ago

MiniMaxAI/MiniMax-M3

Image-Text-to-Text • 427B • Updated 2 days ago • 267k • 1.33k

reacted to FlameF0X's post with 🚀🔥 about 1 month ago

Post

7231

MiniMax-M3 coming soon.
https://github.com/MiniMax-AI/MiniMax-M3

upvoted a paper about 2 months ago

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

Paper • 2605.19577 • Published May 19 • 59

updated a collection about 2 months ago

Model Released

3 items • Updated 8 days ago

liked a dataset about 2 months ago

Kwai-Klear/GoLongRL

Viewer • Updated May 26 • 23k • 452 • 23

upvoted a collection about 2 months ago

Qwen3-Reranker

3 items • Updated Dec 31, 2025 • 71

upvoted a paper about 2 months ago

OProver: A Unified Framework for Agentic Formal Theorem Proving

Paper • 2605.17283 • Published May 17 • 31

liked a dataset about 2 months ago

xiyuRenBill/MEMLENS

Viewer • Updated 29 days ago • 3.16k • 10k • 8

upvoted 2 papers about 2 months ago

STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?

Paper • 2605.06527 • Published May 7 • 47

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Paper • 2605.13831 • Published May 13 • 90

updated a collection 3 months ago

Model Released

3 items • Updated 8 days ago

liked a model 3 months ago

Qwen/Qwen3.6-35B-A3B

Image-Text-to-Text • 36B • Updated Apr 24 • 6.67M • 2.38k