arxiv:2504.04823
SunYuxuan
snowdusky
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
upvoted a paper about 2 months ago
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
upvoted a paper 4 months ago
Scaling Embeddings Outperforms Scaling Experts in Language Models
Organizations
None yet