git.djapps.eu Git - pkg/ggml/sources/ggml/commit
Make IQ1_M work for QK_K = 64 (llama/6327)
author Kawrakow <redacted>
Wed, 27 Mar 2024 07:44:27 +0000 (08:44 +0100)
committer Georgi Gerganov <redacted>
Wed, 27 Mar 2024 11:20:00 +0000 (13:20 +0200)
commit 07d4b6b80214d17713a3d69b9579a6613dec869a
tree 49c2304cbbb3982d22533a794856f32e9db2657b
parent 1ac50846770901f904bc5222faad2c917a3e4922

* iq1_m: make it work for QK_K = 64 (WIP)

* iq1_m: make it work for QK_K = 64 (scalar and AVX2)

* iq1_m: QK_K = 64 seems to work on Metal and ARM_NEON

---------

Co-authored-by: Iwan Kawrakow <redacted>
src/ggml-common.h
src/ggml-metal.metal
src/ggml-quants.c