FlowRL: Matching Reward Distributions for LLM Reasoning • Paper • 2509.15207 • Published Sep 18, 2025 • 119
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation • Paper • 2504.07448 • Published Apr 10, 2025 • 1
deepseek-ai/DeepSeek-Coder-V2-Instruct-0724 • Text Generation • 236B • Updated Oct 8, 2024 • 3.31k • 115