llama : add Q3_K_XS (#5060)
authorKawrakow <redacted>
Mon, 22 Jan 2024 10:43:33 +0000 (12:43 +0200)
committerGitHub <redacted>
Mon, 22 Jan 2024 10:43:33 +0000 (12:43 +0200)
commit66d575c45c5a370d668f9c3283cdf348e2329fa2
tree035e052b116f301508225f897f1943e6eb1b3e19
parent57744932c64266359ee905518de7e096c0295d8c
llama : add Q3_K_XS (#5060)

* Add Q3_K_XS - intermediate size between Q2_K and Q3_K_S

* Q3_K_XS: quantize first 1/8 of ffn_down layers with Q4_K

Together with an importance matrix, this brings perplexity
for LLaMA-v2-70B below the perplexity of the former Q2_K
with an 800 MB smaller quantized model size.

---------

Co-authored-by: Iwan Kawrakow <redacted>
examples/quantize/quantize.cpp
llama.cpp
llama.h