llamaperf

GLM-5.1

Zhipu AI · 1 report

Thin page (1 of 3 reports needed for indexing).
Tone: positive
Quant: NVFP4

Cluster of 16 DGX Spark units with unified memory, serving GLM-5.1-NVFP4 (434 GB of weights) at tensor parallelism TP=8. The reporter plans to test DeepSeek and Kimi next, and later to split prefill and decode using M5 Ultra Mac Studios.
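As a rough sanity check on the figures above, the per-rank weight footprint at TP=8 can be estimated with a few lines of arithmetic. This is a back-of-envelope sketch, not the reporter's method: it assumes weights shard evenly across the 8 tensor-parallel ranks, and it assumes 128 GB of unified memory per DGX Spark (a figure not stated in the report). It ignores KV cache, activations, and any replicated layers, and says nothing about how the remaining 8 of the 16 units are used.

```python
# Back-of-envelope weight sharding under tensor parallelism.
MODEL_WEIGHTS_GB = 434   # from the report
TP_DEGREE = 8            # from the report
NODE_MEMORY_GB = 128     # ASSUMPTION: unified memory per DGX Spark, not in the report


def weights_per_rank_gb(total_gb: float, tp: int) -> float:
    """Even split of the weights across TP ranks (ignores replicated layers)."""
    return total_gb / tp


per_rank = weights_per_rank_gb(MODEL_WEIGHTS_GB, TP_DEGREE)
headroom = NODE_MEMORY_GB - per_rank  # left for KV cache, activations, OS

print(f"{per_rank:.2f} GB of weights per rank, {headroom:.2f} GB headroom per node")
```

Under these assumptions each rank holds roughly 54 GB of weights, which leaves a little over half of each node's memory for everything else.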