arxiv:2510.25110
Ruixuan Tu
TURX
AI & ML interests
LLMs & NLP
Recent Activity
authored a paper about 1 month ago
DEBATE: A Large-Scale Benchmark for Role-Playing LLM Agents in Multi-Agent, Long-Form Debates
authored a paper 7 months ago
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs
authored a paper 7 months ago
Is Semantic Chunking Worth the Computational Cost?
Organizations
None yet