arxiv:2606.19236
haipengluo
haipeng1
AI & ML interests
None yet
Recent Activity
commented on a paper about 3 hours ago
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability
commented on a paper about 3 hours ago
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability
authored a paper about 11 hours ago
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct