VRAM Calculator

Pick your GPU to see which open-weight models fit in VRAM, at which quantization, and roughly how fast they run. Tokens/sec figures come from community-reported benchmarks where we have them.

Inputs: GPU · System RAM · Context length · Quality

0 fit fully · 4 with offload · 0 won't run


Offload · Qwen3.5 · 27B params · Alibaba · Q8_0 · 33.5 GB · —

Offload · Gemma 4 · 30.7B params · Google DeepMind · Q6_K · 32.3 GB · —

Offload · Gemma 3 · 27B params · Google · Q6_K · 28.9 GB · —

Offload · Qwen3.6 · 3B / 35B · Alibaba · FP16 · 11.9 GB · —
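
The three result categories in the summary line (fits fully, with offload, won't run) can be illustrated with a rough sketch. This is not the calculator's actual method, which the page does not describe. It assumes that a model fits fully when its required memory is no larger than VRAM, runs with offload when it fits in VRAM plus system RAM, and otherwise won't run. The function names, the 24 GB / 64 GB hardware figures, and the thresholds are hypothetical. The bits-per-weight values follow the block layouts of llama.cpp's GGUF quantization formats.

```python
# Rough sketch only: names and thresholds are assumptions, not the site's logic.

# Bits per weight implied by llama.cpp block layouts
# (e.g. Q8_0 stores 32 int8 weights plus one fp16 scale = 34 bytes per 32 weights).
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.5625,
    "Q4_0": 4.5,
}


def weights_gb(params_billion: float, quant: str) -> float:
    """Size of the weights alone, in GB (10^9 bytes). Excludes KV cache and overhead."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def classify(required_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Assumed three-way rule matching the labels in the summary line."""
    if required_gb <= vram_gb:
        return "fits fully"
    if required_gb <= vram_gb + ram_gb:
        return "with offload"
    return "won't run"


if __name__ == "__main__":
    # 27B parameters at Q8_0: weights alone.
    print(f"{weights_gb(27, 'Q8_0'):.1f} GB of weights")
    # The listed requirement for Qwen3.5 at Q8_0 is 33.5 GB; the hardware is hypothetical.
    print(classify(33.5, vram_gb=24, ram_gb=64))
```

Under these assumptions, the weights of a 27B model at Q8_0 come to about 28.7 GB, which is below the 33.5 GB listed for Qwen3.5. The difference may reflect context (KV cache) and runtime overhead, which this sketch leaves out.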