git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Q4_1 quantization (#193)
author Matvey Soloviev <redacted>
Fri, 17 Mar 2023 04:48:39 +0000 (05:48 +0100)
committer GitHub <redacted>
Fri, 17 Mar 2023 04:48:39 +0000 (06:48 +0200)
commit 904d2a8d6acd667c9633138d45a361d40fbf76d0
tree 01494c1704cc5c7e5d95ae01edfae3a5df104300
parent 721311070e31464ac12bef9a4444093eb3eaebf7

* Add AVX2 version of ggml_vec_dot_q4_1

* Small optimisations to q4_1 dot product (@Const-me)

* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)

* Fix ggml_vec_mad_q4_1 too

* Fix non-vectorised q4_1 vec mul
ggml.c
utils.cpp
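
For readers unfamiliar with the scheme named in the commit title, here is a minimal sketch of min/delta 4-bit block quantization in the spirit of Q4_1. It is not the code from this commit: the struct `block_q4_1_sketch`, the function names, the block size of 32, and the nibble packing order are all assumptions made for illustration, and the actual layout in ggml.c may differ.

```c
#include <stdint.h>

#define QK 32  /* values per block (assumed for this sketch) */

/* Hypothetical block layout, for illustration only. */
typedef struct {
    float   d;           /* step size: (max - min) / 15 */
    float   m;           /* minimum value in the block */
    uint8_t qs[QK / 2];  /* 32 four-bit codes, two per byte */
} block_q4_1_sketch;

/* Quantize QK floats: each value is mapped to a code q in 0..15
 * such that x is approximately q * d + m. */
static void quantize_block_q4_1_sketch(const float *x, block_q4_1_sketch *b) {
    float min = x[0], max = x[0];
    for (int i = 1; i < QK; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
    }

    const float d  = (max - min) / 15.0f;
    const float id = (d != 0.0f) ? 1.0f / d : 0.0f;  /* guard constant blocks */

    b->d = d;
    b->m = min;

    for (int i = 0; i < QK; i += 2) {
        /* (x - min) * id is non-negative, so adding 0.5 and truncating rounds to nearest. */
        int q0 = (int)((x[i]     - min) * id + 0.5f);
        int q1 = (int)((x[i + 1] - min) * id + 0.5f);
        if (q0 > 15) q0 = 15;  /* clamp against floating-point overshoot */
        if (q1 > 15) q1 = 15;
        b->qs[i / 2] = (uint8_t)(q0 | (q1 << 4));  /* low nibble first (assumed order) */
    }
}

/* Reconstruct approximate floats from a quantized block. */
static void dequantize_block_q4_1_sketch(const block_q4_1_sketch *b, float *y) {
    for (int i = 0; i < QK; i += 2) {
        y[i]     = (float)(b->qs[i / 2] & 0x0F) * b->d + b->m;
        y[i + 1] = (float)(b->qs[i / 2] >> 4)   * b->d + b->m;
    }
}
```

Storing the block minimum alongside the step size lets the 16 codes cover exactly the range the block occupies, so the round-trip error of each value in this sketch is at most about half a step (`d / 2`).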