git.djapps.eu Git - pkg/ggml/sources/ggml/commit
ggml : same IQ4_NL quantization for CPU/CUDA/Metal (llama/6196)
author Kawrakow <redacted>
Thu, 21 Mar 2024 12:59:38 +0000 (13:59 +0100)
committer Georgi Gerganov <redacted>
Wed, 27 Mar 2024 11:20:00 +0000 (13:20 +0200)
commit f5fc72a161c1d210665be5e64ca11421b0e93e1c
tree 8fcbaf2adcc3d03dcc846904c59c9d5de8a0d6a5
parent 36143de86ba4ae32e81252df41367d6b37c0cf57
ggml : same IQ4_NL quantization for CPU/CUDA/Metal (llama/6196)

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

This time for real. backend-ops tests pass.

* Now fix test-quantize-fns

---------

Co-authored-by: Iwan Kawrakow <redacted>
src/ggml-quants.c