Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 13 days ago • 190
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design Paper • 2605.15871 • Published 10 days ago • 16
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn Paper • 2605.13511 • Published 12 days ago • 32
SEIF: Self-Evolving Reinforcement Learning for Instruction Following Paper • 2605.07465 • Published 17 days ago • 29
A Benchmark for Interactive World Models with a Unified Action Generation Framework Paper • 2605.03941 • Published 20 days ago • 5
Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models Paper • 2604.10949 • Published Apr 13 • 40
hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-noece-noaurc-scaletrue-cold-5x-priority-overconf-math Updated Apr 12 • 1
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 629