Rui Sun PRO
ThreeSR
AI & ML interests
Vision and Language Multimodal Learning, CV, NLP, LLM
Recent Activity
upvoted a paper about 8 hours ago
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
upvoted an article 15 days ago
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data
upvoted a paper about 1 month ago
RELIC: Interactive Video World Model with Long-Horizon Memory