nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test011
Updated
nm-testing/TinyLlama-1.1B-Chat-v1.0-NVFP4-test
Updated
nm-testing/Kimi-Linear-48B-A3B-Instruct-FP8-DYNAMIC
49B • Updated • 14
nm-testing/llama2.c-stories42M-pruned2.4
Updated • 167
nm-testing/gpt-oss-20B.eagle3.unconverted-drafter
nm-testing/random-weights-llama3.1.8b-2layer-eagle3-unconverted
Updated • 163
nm-testing/Llama-4-Scout-17B-16E-Instruct-BLOCK-FP8
Text Generation
• 109B • Updated • 10
• 1
nm-testing/Llama-4-Maverick-17B-128E-Instruct-block-FP8
Text Generation
• Updated • 9
nm-testing/Qwen3-VL-235B-A22B-Instruct-FP8-BLOCK
Text Generation
• Updated
nm-testing/Qwen3-30B-A3B-FP8-block
Text Generation
• 3B • Updated • 6
nm-testing/granite-4.0-h-small-FP8-dynamic-test
Updated
nm-testing/tiny-testing-random-weights
584k • Updated • 2.17k
nm-testing/Llama4-Maverick-Eagle3-Speculators-64k-vocab
Updated
nm-testing/Llama-3.1-8B-Instruct-KV-FP8-tensor-static_minmax
8B • Updated • 1
nm-testing/Llama-3.1-8B-Instruct-QKV-FP8-attn_head-static_minmax
8B • Updated
nm-testing/Llama-3.1-8B-Instruct-KV-FP8-attn_head-static_minmax
8B • Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-FP8-tensor-static_minmax
8B • Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-FP8-Head
8B • Updated • 1
nm-testing/Llama-3.1-8B-Instruct-QKV-FP8-Tensor
8B • Updated • 1
nm-testing/Llama-3.1-8B-Instruct-KV-FP8-Tensor
8B • Updated • 2
nm-testing/NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16
2B • Updated • 29
nm-testing/Qwen3-VL-8B-Instruct-W4A16
3B • Updated • 94
nm-testing/Qwen3-VL-8B-Instruct-NVFP4
6B • Updated • 64.8k
• 3
nm-testing/Qwen3-VL-4B-Instruct-NVFP4
3B • Updated • 452
• 2
nm-testing/Llama-3.1-8B-Instruct-NVFP4-mse
5B • Updated
nm-testing/Llama-3.1-8B-Instruct-NVFP4-static_minmax
5B • Updated • 3
nm-testing/EAGLE3-LLaMA3.1-Instruct-8B-sgl
nm-testing/Speculator-Qwen3-8B-Eagle3-converted-071-quantized-w4a16-sgl
nm-testing/Llama-3.2-1B-Instruct-attention-fp8-head
1B • Updated • 1