From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning Paper • 2606.17682 • Published 8 days ago • 26
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning Paper • 2606.13106 • Published 13 days ago • 21
Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It Paper • 2606.11052 • Published 15 days ago • 16
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation Paper • 2606.06428 • Published 20 days ago • 25
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Paper • 2605.30501 • Published 27 days ago • 29
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published May 14 • 147
ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions Paper • 2605.20087 • Published May 19 • 18
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL Paper • 2605.18703 • Published May 18 • 50
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22
Running on CPU Upgrade 14k Open LLM Leaderboard 🏆 14k Track, rank and evaluate open LLMs and chatbots
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 108
Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration Paper • 2508.13755 • Published Aug 19, 2025 • 14