BenchPreS: A Benchmark for Context-Aware Personalized Preference Selectivity of Persistent-Memory LLMs Paper • 2603.16557 • Published 5 days ago • 20
Safe and Scalable Web Agent Learning via Recreated Websites Paper • 2603.10505 • Published 11 days ago • 25
Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams Paper • 2603.07392 • Published 15 days ago • 17
Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch Paper • 2602.03183 • Published Feb 3 • 11
Lost in the Noise: How Reasoning Models Fail with Contextual Distractors Paper • 2601.07226 • Published Jan 12 • 33
K-EXAONE Collection First journey to foundation models with frontier-level performance. • 4 items • Updated Jan 9 • 35
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought Paper • 2510.04230 • Published Oct 5, 2025 • 27
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents Paper • 2509.22830 • Published Sep 26, 2025 • 5
Visual Representation Alignment for Multimodal Large Language Models Paper • 2509.07979 • Published Sep 9, 2025 • 84
Welcome GPT OSS, the new open-source model family from OpenAI! Article • Aug 5, 2025 • 511
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering Paper • 2505.15805 • Published May 21, 2025 • 3