Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps Paper • 2605.16928 • Published 11 days ago • 90
TernaryCLIP: Efficiently Compressing Vision-Language Models with Ternary Weights and Distilled Knowledge Paper • 2510.21879 • Published Oct 23, 2025
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation Paper • 2605.04062 • Published Apr 10 • 30
zhangsq-nju/Qwen3-1.7B-EdgeRazor-GGUF Text Generation • 2B • Updated 16 days ago • 1.07k • 10