CityRAG: Stepping Into a City via Spatially-Grounded Video Generation Paper • 2604.19741 • Published Apr 21 • 17
AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model Paper • 2604.19747 • Published Apr 21 • 40
Article Welcome Gemma 4: Frontier multimodal intelligence on device merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob Viewer • Updated Jan 15 • 435k • 5.84k • 63
KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs Paper • 2601.01046 • Published Jan 3 • 14
MediaTek-Research/Breeze-ASR-25 Automatic Speech Recognition • 2B • Updated Jul 8, 2025 • 7.86k • 129
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models Paper • 2511.11007 • Published Nov 14, 2025 • 15