git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
2-bit quantizations (llama/4897)
author Kawrakow <redacted>
Sun, 14 Jan 2024 07:45:56 +0000 (09:45 +0200)
committer Georgi Gerganov <redacted>
Sun, 14 Jan 2024 08:54:09 +0000 (10:54 +0200)
commit dabc964d8314091a779d64b4ad617bb2eec6f7fb
tree 26053eba5ff3ab7d314897b26e26b043595199c3
parent 654baf693d6d343585e6a014c3c20674754c64d1
2-bit quantizations (llama/4897)

* imatrix: load

* imatrix: WIP

* imatrix: Add Q2_K quantization

* imatrix: also guard against Q2_K_S quantization without importance matrix

* imatrix: guard even more against low-bit quantization misuse

---------

Co-authored-by: Iwan Kawrakow <redacted>
ggml-quants.c
ggml-quants.h
ggml.c
ggml.h