Qwen3.6 27B
M3 Max 128GB · MLX · 290,000 ctx
- throughput: 5.5 t/s gen · 160.0 t/s pp
- quant: Q8 (mlx)
long-context
The user reports 160 tok/s prefill and 5-6 tok/s generation on an M5 Max 128GB running Qwen 3.6 27B Q8 (MLX) at 290k context. GPU utilization sits at only 36-50%, and generation falls short of the 8-14 tok/s the user expected. They ask how these numbers compare with other setups.
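To put the reported rates in wall-clock terms, here is a rough sketch that converts them into wait times. It uses only the figures from the post (290k context, 160 tok/s prefill, 5.5 tok/s generation, and the expected 8-14 tok/s range). The 500-token reply length and the function names are assumptions chosen for illustration, and it assumes the whole context is prefilled once at a constant rate, which a real run may not match.

```python
def prefill_seconds(context_tokens: int, prefill_tps: float) -> float:
    """Time to process the prompt, assuming a constant prefill rate."""
    return context_tokens / prefill_tps


def generation_seconds(new_tokens: int, gen_tps: float) -> float:
    """Time to emit new tokens, assuming a constant generation rate."""
    return new_tokens / gen_tps


# Figures reported in the post.
CONTEXT_TOKENS = 290_000
PREFILL_TPS = 160.0
GEN_TPS = 5.5
EXPECTED_GEN_TPS = (8.0, 14.0)

# Assumption for illustration only: length of one reply.
REPLY_TOKENS = 500

prefill = prefill_seconds(CONTEXT_TOKENS, PREFILL_TPS)
observed = generation_seconds(REPLY_TOKENS, GEN_TPS)
expected_slow = generation_seconds(REPLY_TOKENS, EXPECTED_GEN_TPS[0])
expected_fast = generation_seconds(REPLY_TOKENS, EXPECTED_GEN_TPS[1])

print(f"prefill of full context: {prefill / 60:.1f} min")
print(f"{REPLY_TOKENS}-token reply at observed rate: {observed:.1f} s")
print(f"{REPLY_TOKENS}-token reply at expected rate: "
      f"{expected_fast:.1f}-{expected_slow:.1f} s")
```

At these rates, prefilling the full context takes about half an hour, so prefill time dominates the wait far more than the gap between observed and expected generation speed.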