From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning
Paper: 2606.17682
These are the checkpoints and dataset for: From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning