Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination
Abstract
The Atomic Decomposition and Recombination (ADR) framework generates novel and challenging verifiable code tasks for scalable reinforcement learning with verifiable rewards in large language models.
Reinforcement Learning with Verifiable Rewards (RLVR) has recently emerged as the cornerstone for shaping the coding abilities of Large Language Models (LLMs). However, the scalability of RLVR is severely constrained by the scarcity of sufficiently challenging verifiable code tasks that target the model's edge of competence. Prior studies often rely on heuristic seed expansion for data synthesis, which limits both novelty and difficulty. Consequently, the training value of such data fails to scale proportionally with the volume of synthesized data. To address this, we propose Atomic Decomposition and Recombination (ADR), a framework that generates verifiable code tasks via decomposition into atomic elements and controlled recombination, yielding genuinely novel and challenging tasks. Experiments and analysis demonstrate that ADR achieves superior originality, difficulty, diversity, and test quality over existing baselines, and consistently delivers larger gains in coding ability under RLVR across diverse downstream domains, including algorithmic programming, tool usage, and data science. Our work sheds light on a new paradigm for novel code task synthesis and scalable RLVR training.
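The abstract does not specify how decomposition or recombination is implemented, so the following is only a toy sketch of the general idea, not the authors' method. It assumes that each seed task can be tagged with a small set of "atomic elements" and that recombination means proposing element combinations that no seed task already covers. The seed tasks, the tags, and the function names `decompose` and `recombine` are all hypothetical, and the sketch omits difficulty control and test generation entirely.

```python
from itertools import combinations

# Hypothetical seed tasks, each hand-tagged with illustrative "atomic elements".
# These names and tags are assumptions made for this sketch only.
SEED_TASKS = {
    "two_sum": {"hash map", "array scan"},
    "shortest_path": {"graph", "priority queue"},
    "interval_merge": {"sorting", "array scan"},
}


def decompose(seed_tasks):
    """Collect the pool of distinct atoms across all seed tasks."""
    atoms = set()
    for tags in seed_tasks.values():
        atoms |= tags
    return sorted(atoms)


def recombine(atoms, seed_tasks, k=2):
    """Return k-atom combinations that no single seed task already covers."""
    seen = [frozenset(tags) for tags in seed_tasks.values()]
    return [
        combo
        for combo in combinations(atoms, k)
        if not any(set(combo) <= tags for tags in seen)
    ]


atoms = decompose(SEED_TASKS)
novel = recombine(atoms, SEED_TASKS)

for combo in novel:
    print(" + ".join(combo))
```

In this toy setting, each printed combination would be a candidate specification for a new task. A real pipeline would still need to turn it into a problem statement with verifiable tests before it could serve as an RLVR training task.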