Shi Weikang PRO

swk20

shiwk20

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

liked a dataset about 1 month ago

shiwk24/MathCanvas-Instruct

upvoted a paper 4 months ago

MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning

Organizations

None yet

upvoted a paper about 1 month ago

NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 130

liked a dataset about 1 month ago

shiwk24/MathCanvas-Instruct

Viewer • Updated Nov 18, 2025 • 219k • 403 • 4

upvoted 4 papers 4 months ago

MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning

Paper • 2510.14958 • Published Oct 16, 2025 • 23

VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing

Paper • 2509.22651 • Published Sep 26, 2025 • 23

WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

Paper • 2509.22644 • Published Sep 26, 2025 • 21

DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving

Paper • 2510.12796 • Published Oct 14, 2025 • 12

upvoted a paper 5 months ago

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Paper • 2509.09680 • Published Sep 11, 2025 • 43

upvoted 3 papers 9 months ago

UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents

Paper • 2505.21496 • Published May 27, 2025 • 38

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Paper • 2505.10557 • Published May 15, 2025 • 47

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Paper • 2505.03733 • Published May 6, 2025 • 17

upvoted a paper 11 months ago

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

liked a dataset 11 months ago

agentica-org/DeepScaleR-Preview-Dataset

Viewer • Updated Feb 10, 2025 • 40.3k • 7.56k • 187

upvoted a paper about 1 year ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8, 2025 • 288

liked a Space about 1 year ago

Scaling test-time compute

📈

591

Implement test-time compute scaling for math problems

reacted to AdinaY's post with 👍 about 1 year ago

Post

1674

🌊 The wave of reasoning models from the Chinese community has arrived!

🚀 Marco-o1 by AIDC, Alibaba
👉 AIDC-AI/Marco-o1

✨ QwQ by Qwen, Alibaba
👉 Qwen/qwq-674762b79b75eac01735070a

🌟 Skywork-o1 by Kunlun Tech
👉 Skywork/skywork-o1-open-67453df58e12f6c3934738d0

🔥 Xkev/Llama-3.2V-11B-cot by PKU Yuan group
👉 Xkev/Llama-3.2V-11B-cot

💡 DeepSeek-R1-Lite-Preview by DeepSeek AI
👉 https://chat.deepseek.com/

🔍 InternThinker Preview by Shanghai AI Lab
👉 https://sso.openxlab.org.cn/login?redirect=https://internlm-chat.intern-ai.org.cn/&clientId=ebmrvod6yo0nlzaek1yp

📘 k0-math by Moonshot AI
🚀 https://kimi.moonshot.cn/ (coming soon!)

Who's next? 👀
zh-ai-community/reasoning-models-67409fb3aa1ed78f10087cd7

upvoted a paper about 1 year ago

BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices

Paper • 2411.10640 • Published Nov 16, 2024 • 46

upvoted 2 papers over 1 year ago

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Paper • 2410.13861 • Published Oct 17, 2024 • 56

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Paper • 2410.08196 • Published Oct 10, 2024 • 48

commented on a paper over 1 year ago

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning

Paper • 2406.10834 • Published Jun 16, 2024

liked a model over 1 year ago

Qwen/Qwen2-Math-72B-Instruct

Text Generation • 73B • Updated Sep 13, 2024 • 135 • 90
