KoboldCpp

A llama.cpp-based inference engine for running open-weight LLMs locally.

0 community reports

This engine doesn't yet have an editorial profile on llamaperf, and no community reports have been submitted so far. Once submitted, community reports will show how it's been used in practice across different hardware.