YanxingLiu

lyx98

AI & ML interests

Computer Vision

Organizations

None yet

upvoted 2 papers 8 days ago

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

Paper • 2605.27365 • Published 9 days ago • 135

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

Paper • 2605.24117 • Published 13 days ago • 20

upvoted an article 20 days ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 164

upvoted a paper 20 days ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published 21 days ago • 85

upvoted a paper 26 days ago

Lightning Unified Video Editing via In-Context Sparse Attention

Paper • 2605.04569 • Published 29 days ago • 18

upvoted a paper 28 days ago

ARIS: Autonomous Research via Adversarial Multi-Agent Collaboration

Paper • 2605.03042 • Published about 1 month ago • 125

upvoted a paper 30 days ago

Let ViT Speak: Generative Language-Image Pre-training

Paper • 2605.00809 • Published May 1 • 33

upvoted 2 papers about 1 month ago

UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors

Paper • 2605.00658 • Published May 1 • 84

Elucidating the SNR-t Bias of Diffusion Probabilistic Models

Paper • 2604.16044 • Published Apr 17 • 73

upvoted 2 papers about 2 months ago

LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

Paper • 2604.15311 • Published Apr 16 • 13

OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation

Paper • 2604.11804 • Published Apr 13 • 72

upvoted 2 papers 3 months ago

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Paper • 2602.24286 • Published Feb 27 • 99

DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

Paper • 2602.21548 • Published Feb 25 • 53

upvoted 3 papers 4 months ago

Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Paper • 2602.11858 • Published Feb 12 • 64

ERNIE 5.0 Technical Report

Paper • 2602.04705 • Published Feb 4 • 269

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Paper • 2602.01785 • Published Feb 2 • 97

upvoted a paper 6 months ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 246

upvoted 2 papers 7 months ago

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published Nov 12, 2025 • 216

DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published Nov 7, 2025 • 47

upvoted a paper 9 months ago

Visual Programmability: A Guide for Code-as-Thought in Chart Understanding

Paper • 2509.09286 • Published Sep 11, 2025 • 11
