haipengluo

haipeng1

AI & ML interests

None yet

Recent Activity

upvoted a paper 26 days ago

TurnOPD: Making On-Policy Distillation Turn-Aware for Efficient Long-Horizon Agent Training

commented on a paper about 1 month ago

STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

commented on a paper about 2 months ago

STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability

Organizations

Papers 5

arxiv:2606.19236

arxiv:2512.20745

arxiv:2505.15431

arxiv:2407.10627

Models 1

haipeng1/hp_intern2

Updated Jun 30, 2024

Datasets 0

None public yet