arxiv:2508.10395
Wonjun Kang
wjkang
AI & ML interests
None yet
Recent Activity
upvoted a paper about 11 hours ago
EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts
upvoted a paper 8 months ago
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs
authored a paper 10 months ago
XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization