https://www.lesswrong.com/posts/HLJoJYi52mxgomujc/realistic-reward-hacking-induces-different-and-deeper-1
Sharan Maiya
maius
Recent Activity
updated a model about 19 hours ago
Cadenza-Labs/kimi-k2.6-ST-gender
published a model about 20 hours ago
Cadenza-Labs/kimi-k2.6-ST-gender
authored a paper about 2 months ago
Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI