git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
ggml : add Q8_0 quantization for intermediate results (#951)
author Georgi Gerganov <redacted>
Sat, 15 Apr 2023 14:53:22 +0000 (17:53 +0300)
committer GitHub <redacted>
Sat, 15 Apr 2023 14:53:22 +0000 (17:53 +0300)
commit e95b6554b493e71a0275764342e09bd5784a7026
tree 6b9d3e9d4eb23b64ae76f0108b409aa5825cd1b8
parent aa485cee334e84437e21681c14b6f80b65876d8b

* ggml : add Q8_0 quantization for intermediate results

* quantize-stats : fix test + add it to Makefile default

* Q8: use int8_t, AVX/AVX2 optimizations

* ggml : fix quantize_row_q8_0() ARM_NEON rounding

* minor : updates after rebase to latest master

* quantize-stats : delete obsolete strings

* ggml : fix q4_1 dot func
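
For readers unfamiliar with the idea, block-wise 8-bit quantization of a row of floats can be sketched roughly as follows. This is a minimal illustration, not the code from ggml.c: the block size of 32, the struct layout, and every `sketch_` name are assumptions made for the example, and it rounds to nearest (half away from zero) in plain C with no SIMD path.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_QK 32  /* values per block (assumed for this sketch) */

/* Hypothetical block layout: one float scale plus SKETCH_QK signed bytes. */
typedef struct {
    float  d;              /* scale: original value is roughly d * qs[l] */
    int8_t qs[SKETCH_QK];  /* quantized values in [-127, 127] */
} sketch_block_q8;

/* Quantize n floats into n / SKETCH_QK blocks; n must be a multiple of SKETCH_QK. */
static void sketch_quantize_row_q8(const float *x, sketch_block_q8 *y, int n) {
    assert(n % SKETCH_QK == 0);
    const int nb = n / SKETCH_QK;

    for (int i = 0; i < nb; i++) {
        /* Largest magnitude in the block determines the scale. */
        float amax = 0.0f;
        for (int l = 0; l < SKETCH_QK; l++) {
            const float v = x[i * SKETCH_QK + l];
            const float a = v < 0.0f ? -v : v;
            if (a > amax) amax = a;
        }

        const float d  = amax / 127.0f;
        const float id = d != 0.0f ? 1.0f / d : 0.0f;  /* all-zero block maps to zeros */
        y[i].d = d;

        for (int l = 0; l < SKETCH_QK; l++) {
            const float v = x[i * SKETCH_QK + l] * id;
            /* Round to nearest, ties away from zero, then truncate to int8_t. */
            y[i].qs[l] = (int8_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
        }
    }
}

/* Reconstruct approximate floats from the quantized blocks. */
static void sketch_dequantize_row_q8(const sketch_block_q8 *y, float *x, int n) {
    assert(n % SKETCH_QK == 0);
    const int nb = n / SKETCH_QK;

    for (int i = 0; i < nb; i++) {
        for (int l = 0; l < SKETCH_QK; l++) {
            x[i * SKETCH_QK + l] = y[i].d * (float)y[i].qs[l];
        }
    }
}
```

With a per-block scale like this, the reconstruction error for each value stays within about half a quantization step (`d / 2`). Because the quantized values are plain `int8_t`, a dot product against another quantized row can accumulate in integers and apply the two scales once per block.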

---------

Co-authored-by: Stephan Walter <redacted>
Makefile
ggml.c
ggml.h