Papers
Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation • Paper 2606.18844 • Published 8 days ago • 17
APPO: Agentic Procedural Policy Optimization • Paper 2606.12384 • Published 15 days ago • 77