fulong ye

Alon77777

https://scholar.google.com.hk/citations?hl=zh-CN&user=-BbQ5VgAAAAJ

superhero-7

AI & ML interests

vision and language, diffusion model, text-to-image generation, image-to-text generation, referring expression generation and comprehension

Recent Activity

authored a paper 1 day ago

SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph

authored a paper 1 day ago

AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities

authored a paper 1 day ago

AltDiffusion: A Multilingual Text-to-Image Diffusion Model

authored 7 papers 1 day ago

SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph

Paper • 2301.01949 • Published Jan 5, 2023

AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities

Paper • 2211.06679 • Published Nov 12, 2022 • 2

AltDiffusion: A Multilingual Text-to-Image Diffusion Model

Paper • 2308.09991 • Published Aug 19, 2023 • 3

InstructX: Towards Unified Visual Editing with MLLM Guidance

Paper • 2510.08485 • Published Oct 9, 2025 • 18

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published Jan 4 • 53

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Paper • 2602.12160 • Published Feb 12 • 38

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published 3 days ago • 133

upvoted a paper 1 day ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published 3 days ago • 133

upvoted a paper 24 days ago

Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

Paper • 2603.22212 • Published 25 days ago • 126

upvoted a paper about 2 months ago

DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation

Paper • 2602.12160 • Published Feb 12 • 38

commented on a paper 3 months ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published Jan 20 • 48

authored a paper 3 months ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published Jan 20 • 48

upvoted 3 papers 3 months ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published Jan 20 • 48

DreamStyle: A Unified Framework for Video Stylization

Paper • 2601.02785 • Published Jan 6 • 24

DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer

Paper • 2601.01425 • Published Jan 4 • 53

upvoted 2 papers 7 months ago

OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models

Paper • 2509.17627 • Published Sep 22, 2025 • 66

UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward

Paper • 2509.06818 • Published Sep 8, 2025 • 29

upvoted a paper 8 months ago

USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

Paper • 2508.18966 • Published Aug 26, 2025 • 56

upvoted a paper 10 months ago

Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset

Paper • 2506.18851 • Published Jun 23, 2025 • 30

upvoted a paper 11 months ago

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Paper • 2505.24625 • Published May 30, 2025 • 9
