- throughput: 60.0 t/s gen
- quant: FP16 (safetensors)
text-generation
Gemma 4 26B MoE on Instinct MI300X via vLLM + ROCm. Full FP16. Only a fraction of the MoE parameters is active per token, which makes generation very fast. Source: gemma4-ai.com AMD GPU guide
AMD · 192GB · 2 reports
Gemma 4 26B MoE on Instinct MI300X via vLLM + ROCm. Full FP16. Only a fraction of the MoE parameters is active per token, which makes generation very fast. Source: gemma4-ai.com AMD GPU guide
Gemma 4 31B on Instinct MI300X via vLLM + ROCm. Full FP16. Datacenter grade. Source: gemma4-ai.com AMD GPU guide
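A rough check of why both reports can run at full FP16 on a single 192GB card: FP16 stores 2 bytes per parameter, so weight size is about twice the parameter count in GB. The sketch below assumes the sizes in the model names (26B, 31B) are total parameter counts, and it ignores KV cache, activations, and runtime overhead, so real headroom is smaller. The function name `fp16_weight_gb` is illustrative only.

```python
# Back-of-the-envelope FP16 weight footprint.
# Assumption: "26B" and "31B" are total parameter counts (not confirmed by the reports).
# Ignores KV cache, activations, and framework overhead.

BYTES_PER_FP16_PARAM = 2
GPU_MEMORY_GB = 192  # memory size listed for the card in the reports


def fp16_weight_gb(params_billions: float) -> float:
    """Approximate FP16 weight size in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_FP16_PARAM / 1e9


for name, params_b in [("Gemma 4 26B MoE", 26), ("Gemma 4 31B", 31)]:
    weights = fp16_weight_gb(params_b)
    headroom = GPU_MEMORY_GB - weights
    print(f"{name}: ~{weights:.0f} GB weights, ~{headroom:.0f} GB left of {GPU_MEMORY_GB} GB")
```

For the MoE model the full weight set still has to be resident in memory, even though only some experts run per token; the speed benefit comes from compute, not from a smaller footprint.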