git.djapps.eu Git - pkg/ggml/sources/ggml/commit
2-bit quantizations (llama/4897)
author Kawrakow <redacted>
Sun, 14 Jan 2024 07:45:56 +0000 (09:45 +0200)
committer Georgi Gerganov <redacted>
Sun, 14 Jan 2024 08:45:23 +0000 (10:45 +0200)
commit 82f6044aabaa7ecca3ae88f98a39cb1195e74346
tree af3c74fb346c7ef3199e57eaea03c1f7ecee034d
parent 1890780da4ea10db88736fcde85f285abf6c64b0

* imatrix: load

* imatrix: WIP

* imatrix: Add Q2_K quantization

* imatrix: also guard against Q2_K_S quantization without importance matrix

* imatrix: guard even more against low-bit quantization misuse

---------

Co-authored-by: Iwan Kawrakow <redacted>
include/ggml/ggml.h
src/ggml-quants.c
src/ggml-quants.h
src/ggml.c
tests/test-backend-ops.cpp