llamaperf

Qwen2.5

Alibaba · 1 report

Qwen2.5 32B Coder

RTX 3090 · llama.cpp · 32,768 ctx

Tone: mixed
throughput: 28.0 t/s gen · 450.0 t/s pp
quant: Q4_K_M (GGUF)
kv: Q8
coding

Solid for autocomplete, but occasionally hallucinates imports in multi-file refactors. Tested on llama.cpp build b4400.
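
To put the throughput figures in context, here is a rough sketch that turns them into wall-clock estimates. It assumes both rates stay constant at any context depth and that nothing is served from a prompt cache; in practice speed usually drops as the context fills, so treat the results as optimistic. The function name and the two sample workloads are hypothetical and not part of the report.

```python
# Figures copied from the report above.
GEN_TPS = 28.0   # generation speed, tokens per second
PP_TPS = 450.0   # prompt processing speed, tokens per second
CTX = 32_768     # configured context window, tokens


def estimate_latency(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate seconds for one request: prompt processing plus generation.

    Assumes constant throughput, which is a simplification.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens + output_tokens > CTX:
        raise ValueError("request exceeds the configured context window")
    return prompt_tokens / PP_TPS + output_tokens / GEN_TPS


if __name__ == "__main__":
    # Hypothetical workloads for illustration only.
    workloads = [
        ("autocomplete", 2_000, 64),
        ("multi-file refactor", 24_000, 1_500),
    ]
    for name, prompt, output in workloads:
        print(f"{name}: ~{estimate_latency(prompt, output):.1f} s")
```

For short completions the prompt processing term dominates, which is why a cold multi-thousand-token prompt can feel slow even when the generation speed is comfortable.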