ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196)
authorKawrakow <redacted>
Thu, 21 Mar 2024 12:59:38 +0000 (13:59 +0100)
committerGitHub <redacted>
Thu, 21 Mar 2024 12:59:38 +0000 (14:59 +0200)
commitcfd3be76e37dab92c846d75a2421178f20db4a11
treead28138a7dcd4e39ad864d6ebb9d68f501b7db45
parent5b7b0ac8dfdd800c0fd0dc69b69991e8cb19fb46
ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196)

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

This time for real. The backend-ops tests pass.

* Now fix test-quantize-fns

---------

Co-authored-by: Iwan Kawrakow <redacted>
ggml-quants.c