Shiyu Huang
ShiyuHuang
AI & ML interests
LLM, VLM, RL, Agent
Recent Activity
updated a collection 2 days ago
auto-research
updated a collection 8 days ago
tts
upvoted a paper 16 days ago
GenClaw: Code-Driven Agentic Image Generation
Organizations
auto-research
audio-benchmark
video_benchmark
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
  Paper • 2501.12380 • Published • 83
- OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
  Paper • 2501.05510 • Published • 44
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
  Paper • 2412.09596 • Published • 97
music_gen
ASR
tts
streaming_model
Reasoning
llm4code