Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (#5721)
author     Kawrakow <redacted>
           Mon, 26 Feb 2024 16:28:38 +0000 (18:28 +0200)
committer  GitHub <redacted>
           Mon, 26 Feb 2024 16:28:38 +0000 (18:28 +0200)
commit     a33e6a0d2a66104ea9a906bdbf8a94d050189d91
tree       30478b4a0b1792d1af66c5d64e2c3c4fa1af74ab
parent     47bb7b48c7cec9d8f57d56812ce811ec130b89a3
Adding IQ2_S and IQ2_M to complete coverage of the 2-3 bit quantization range (#5721)

* Adding IQ2_S and IQ2_M as a single cumulative commit

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <redacted>
---------

Co-authored-by: Iwan Kawrakow <redacted>
Co-authored-by: Georgi Gerganov <redacted>
12 files changed:
examples/quantize/quantize.cpp
ggml-cuda.cu
ggml-metal.m
ggml-metal.metal
ggml-quants.c
ggml-quants.h
ggml.c
ggml.h
llama.cpp
llama.h
tests/test-backend-ops.cpp
tests/test-quantize-fns.cpp
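
For context, a minimal sketch (not part of this commit's diff) of how the two new file types might be selected through the public C API in llama.h, assuming the LLAMA_FTYPE_MOSTLY_IQ2_S / LLAMA_FTYPE_MOSTLY_IQ2_M enum values introduced here and the stock llama_model_quantize entry point; the file paths and thread count are placeholders, and the exact field set of llama_model_quantize_params can differ between llama.cpp revisions.

    // Sketch: quantize a GGUF model to the new IQ2_S type via the C API.
    #include "llama.h"
    #include <stdio.h>

    int main(void) {
        llama_backend_init();  // note: older revisions take a bool `numa` argument

        struct llama_model_quantize_params params = llama_model_quantize_default_params();
        params.ftype   = LLAMA_FTYPE_MOSTLY_IQ2_S;  // or LLAMA_FTYPE_MOSTLY_IQ2_M
        params.nthread = 4;                         // placeholder thread count

        // llama_model_quantize returns 0 on success.
        if (llama_model_quantize("model-f16.gguf", "model-iq2_s.gguf", &params) != 0) {
            fprintf(stderr, "quantization to IQ2_S failed\n");
            llama_backend_free();
            return 1;
        }

        llama_backend_free();
        return 0;
    }

The same selection is also exposed on the command line by examples/quantize, where the commit adds IQ2_S and IQ2_M to the accepted quantization type names.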