Ofer Hasson

hassonofer


AI & ML interests

Computer Vision

Recent Activity

upvoted a collection about 5 hours ago

Perception Encoder

reacted to Anran-MLLM's post with 👍 about 5 hours ago

🚀 Introducing PerceptionDLM, the first multimodal diffusion LLM for parallel region perception! Most MLLMs are autoregressive, so captioning N regions costs N sequential passes. PerceptionDLM instead describes ALL masked regions in a single denoising process. 🧩

✨ Highlights
• ⚡ Up to 3.4× faster on dense multi-region captioning, with stable per-image latency
• 🏆 PerceptionDLM-Base beats LLaDA-V on 15/16 multimodal benchmarks (new SOTA among open diffusion VLMs)
• 📊 New benchmark: ParaDLC-Bench, which jointly evaluates caption quality AND inference efficiency
• 🔓 Code, models & benchmark all open-sourced

🤖 Models
https://huggingface.co/MSALab/PerceptionDLM-Base
https://huggingface.co/MSALab/PerceptionDLM

📊 Benchmark
https://huggingface.co/datasets/MSALab/ParaDLC-Bench

📄 Paper: https://huggingface.co/papers/2606.19534
💻 Code: https://github.com/MSALab-PKU/PerceptionDLM

Diffusion LLMs aren't just for text: they unlock efficient, parallel visual perception. 👁️✨

#multimodal #diffusion #VLM #perception

liked a model about 18 hours ago

MiniMaxAI/MiniMax-M3

models 4

hassonofer/vit_reg1_s14_ls_dino-v2-dist-bio

Image Feature Extraction • Updated May 18 • 4 • 1

hassonofer/vit_so150m_patch14_reg4_biodino_336

Image Feature Extraction • Updated May 10 • 2

hassonofer/vit_so150m_patch14_reg4_biodino_252

Image Feature Extraction • Updated May 10 • 2

hassonofer/vit_so150m_patch14_reg4_biodino_224

Image Feature Extraction • Updated May 10 • 3

datasets 0

None public yet