ChinaTravel: Evaluate and compare AI model performance on ChinaTravel benchmark tasks Space • Running • 2
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning Paper • 2510.15444 • Published Oct 17, 2025 • 148
ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning Paper • 2412.13682 • Published Dec 18, 2024 • 7
Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models Paper • 2502.04404 • Published Feb 6, 2025 • 25