From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning Paper • 2606.07190 • Published 16 days ago • 35
Agentic Environment Engineering for Large Language Models: A Survey of Environment Modeling, Synthesis, Evaluation, and Application Paper • 2606.12191 • Published 11 days ago • 66