llamaperf

Qwen2.5

Alibaba · 2 reports

Benchmark of abliteration tools (Apostate, Huihui, Heretic) on Qwen2.5 7B, evaluated with lm-evaluation-harness via vLLM 0.19.0 in bf16 on an RTX 5090 32GB. Reports MMLU, GSM8K, HellaSwag, ARC Challenge, WinoGrande, TruthfulQA MC2, PIQA, LAMBADA perplexity, HarmBench ASR (attack success rate), and KL divergence. No tokens/sec reported.

Qwen2.5 32B Coder

RTX 3090 · llama.cpp · 32,768 ctx

Tone: mixed
throughput: 28.0 t/s gen · 450.0 t/s pp
quant: Q4_K_M (gguf)
kv: Q8
coding

Solid for autocomplete, but occasionally hallucinates imports in multi-file refactors. llama.cpp build b4400.
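
For a rough sense of what the reported throughput means in practice, the two figures (450.0 t/s prompt processing, 28.0 t/s generation) can be combined into a wall-clock estimate. This is only a sketch: the function name and the example workload are hypothetical, and it assumes both rates stay constant, which in practice they usually do not as the context fills up.

```python
def estimate_latency_s(prompt_tokens, output_tokens, pp_tps=450.0, gen_tps=28.0):
    """Estimate wall-clock seconds for one request.

    Defaults are the figures from the report above. Assumes constant
    rates regardless of context depth (a simplification).
    """
    if pp_tps <= 0 or gen_tps <= 0:
        raise ValueError("throughput values must be positive")
    prefill_s = prompt_tokens / pp_tps   # time to process the prompt
    decode_s = output_tokens / gen_tps   # time to generate the reply
    return prefill_s + decode_s


if __name__ == "__main__":
    # Hypothetical workload: a 4,096-token prompt with a 256-token completion.
    print(f"{estimate_latency_s(4096, 256):.1f} s")
    # Prompt processing alone for the full 32,768-token context.
    print(f"{estimate_latency_s(32768, 0):.1f} s")
```

Under these assumptions, filling the full 32,768-token context takes over a minute of prompt processing before the first generated token, which matters more for multi-file refactors than for short autocomplete requests.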