Optimised AWQ quants for high-throughput deployments of Gemma2! Compatible with Transformers, TGI & vLLM 🤗
AI & ML interests
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Organization Card
Welcome to the home of exciting quantized models! We'd love to see increased adoption of powerful state-of-the-art open models, and quantization is a key component in making them run on more types of hardware.
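To make the hardware point concrete, here is a rough back-of-the-envelope sketch of how weight precision changes the memory needed for the weights alone. The helper name `weight_memory_gib` and the nominal 3B parameter count are assumptions for illustration. Real quantized checkpoints also store scales and other metadata, and inference needs extra memory for activations and the KV cache, so treat the numbers as lower bounds.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Ignores quantization metadata (scales, zero points), activations,
    and the KV cache, so actual usage will be higher.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal parameter count for a "3B" model (assumed, rounded).
NUM_PARAMS = 3e9

for label, bits in [("16-bit", 16.0), ("8-bit", 8.0), ("4-bit", 4.0)]:
    size = weight_memory_gib(NUM_PARAMS, bits)
    print(f"3B parameters at {label}: ~{size:.2f} GiB of weights")
```

Halving the bits per weight halves the weight footprint, which is what lets a model that needs a data-centre GPU at 16-bit fit on consumer hardware at 4-bit.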
Resources:
- Llama 3.1 Quantized Models: Optimised quants of Llama 3.1 for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗.
- Hugging Face Llama Recipes: A set of minimal recipes to get started with Llama 3.1.
Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models.
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
  Text Generation • 3B • Updated • 3.76k • 52
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
  Text Generation • 3B • Updated • 25.9k • 27
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
  Text Generation • 1B • Updated • 780k • 46
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
  Text Generation • 1B • Updated • 40.4k • 21
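As a hedged sketch of how one of these GGUF repos might be run with llama.cpp, the snippet below assembles a `llama-cli` invocation that pulls the model straight from the Hub using the `--hf-repo` and `--hf-file` options. The helper `llama_cli_command` is hypothetical, and the GGUF file name is an assumption; check the repo's file list for the actual name before running.

```python
import shlex


def llama_cli_command(repo_id: str, gguf_file: str, prompt: str) -> list[str]:
    """Build the argument list for a llama.cpp `llama-cli` run.

    The model file is fetched from the Hugging Face Hub repo `repo_id`.
    """
    return [
        "llama-cli",
        "--hf-repo", repo_id,
        "--hf-file", gguf_file,
        "-p", prompt,
    ]


cmd = llama_cli_command(
    repo_id="hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF",
    # Assumed file name; verify it against the files in the repo.
    gguf_file="llama-3.2-1b-instruct-q4_k_m.gguf",
    prompt="The meaning to life and the universe is",
)

# Print a shell-safe command line to copy into a terminal.
print(shlex.join(cmd))
```

Building the command as a list keeps the prompt as a single argument, so it can also be passed unchanged to `subprocess.run(cmd)` on a machine where llama.cpp is installed.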
models 21
hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm
Image-Text-to-Text • 109B • Updated • 7 • 2
hugging-quants/Llama-4-Scout-17B-16E-Instruct-fbgemm-unfused
Image-Text-to-Text • 109B • Updated • 6 • 2
hugging-quants/gemma-2-9b-it-AWQ-INT4
Text Generation • 9B • Updated • 2.43k • 8
hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4
Text Generation • 47B • Updated • 15.2k
hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation • 1B • Updated • 40.4k • 21
hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
Text Generation • 1B • Updated • 780k • 46
hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
Text Generation • 3B • Updated • 25.9k • 27
hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
Text Generation • 3B • Updated • 3.76k • 52
hugging-quants/Meta-Llama-3.1-405B-BNB-NF4
Text Generation • 418B • Updated • 48 • 2
hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4
Text Generation • 423B • Updated • 74 • 5
datasets 0
None public yet
