git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
2-bit quantizations (#4897)
author Kawrakow <redacted>
Sun, 14 Jan 2024 07:45:56 +0000 (09:45 +0200)
committer GitHub <redacted>
Sun, 14 Jan 2024 07:45:56 +0000 (09:45 +0200)
commit 147b17ac94a24d524e367cda26a9ff6245689f34
tree 6bae34826f82aa28a60ccb26de8eda0464774110
parent 807179ec583dcb882f97d9704577c06beb2c5ec9

* imatrix: load

* imatrix: WIP

* imatrix: Add Q2_K quantization

* imatrix: also guard against Q2_K_S quantization without importance matrix

* imatrix: guard even more against low-bit quantization misuse

---------

Co-authored-by: Iwan Kawrakow <redacted>
examples/benchmark/benchmark-matmult.cpp
examples/quantize/quantize.cpp
ggml-quants.c
ggml-quants.h
ggml.c
ggml.h
llama.cpp
llama.h
tests/test-backend-ops.cpp
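
The commit message mentions guarding Q2_K_S quantization against use without an importance matrix. As a rough illustration of what such a guard could look like, here is a minimal, self-contained sketch. Every identifier in it (QuantType, ImportanceMatrix, needs_importance_matrix, check_quantization_request) is hypothetical and chosen for illustration; none is taken from the files listed above, and the actual check in this commit may be structured differently.

```cpp
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical quantization types for illustration only;
// these are not the identifiers used in llama.cpp.
enum class QuantType { Q4_0, Q2_K, Q2_K_S };

// Assumed layout: per-tensor importance values keyed by tensor name.
using ImportanceMatrix = std::unordered_map<std::string, std::vector<float>>;

// Assumption: only the smallest 2-bit variant is refused without importance data.
bool needs_importance_matrix(QuantType type) {
    return type == QuantType::Q2_K_S;
}

// Returns true when quantization may proceed, false when the request
// should be rejected because no importance matrix was supplied.
bool check_quantization_request(QuantType type, const ImportanceMatrix & imatrix) {
    if (needs_importance_matrix(type) && imatrix.empty()) {
        std::cerr << "refusing very low-bit quantization without an importance matrix\n";
        return false;
    }
    return true;
}
```

A caller would run a check of this kind once, before any tensor is quantized, so that a misconfigured request fails early instead of producing a degraded model.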