Loren

lsmc

AI & ML interests

None yet

Recent Activity

new activity 3 months ago

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4:vLLM MTP unusable on RTX 6000 Pro, as spec decoding consumes 20GB+ VRAM at start-up, causing OOM

Organizations

None yet

New activity in nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 3 months ago

vLLM MTP unusable on RTX 6000 Pro, as spec decoding consumes 20GB+ VRAM at start-up, causing OOM

#9 opened 3 months ago by
