git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (#11595)
author     Rémy O <redacted>
           Fri, 28 Feb 2025 08:42:52 +0000 (09:42 +0100)
committer  GitHub <redacted>
           Fri, 28 Feb 2025 08:42:52 +0000 (09:42 +0100)
commit     438a83926afcff3643ffef5543db67545ceffe39
tree       db239777851abdd65f6fae3de8777c4f1c65c42d
parent     9c42b1718ca8299f9afeabdc122badeab64c9690

* vulkan: implement specialized MMV kernels for IQ2 quantizations

* vulkan: add MMV kernels for IQ3 quants

* vulkan: Increase MMV batch size and unroll IQ LUT setup

* vulkan: fix init_iq_shmem for WG sizes larger than tables

* vulkan: common batch size for all I-quants
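
Note on the init_iq_shmem item above: the sketch below is a CPU-side illustration, not the shader code from this commit. It assumes the lookup table is copied into shared memory cooperatively, with each invocation handling indices at a stride of the workgroup size, and shows why a bounds guard matters once the workgroup is larger than the table. The function name cooperative_copy and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU simulation of a workgroup filling a shared-memory lookup table.
// Each simulated invocation `tid` handles indices tid, tid + wg_size, tid + 2*wg_size, ...
// `skipped` counts the accesses the guard rejected; without the guard these
// would be out-of-bounds reads and writes.
std::vector<uint32_t> cooperative_copy(const std::vector<uint32_t>& table,
                                       std::size_t wg_size,
                                       std::size_t& skipped) {
    std::vector<uint32_t> shared(table.size(), 0);
    skipped = 0;
    for (std::size_t tid = 0; tid < wg_size; ++tid) {  // one pass per invocation
        for (std::size_t base = 0; base < table.size(); base += wg_size) {
            const std::size_t idx = base + tid;
            if (idx < table.size()) {
                shared[idx] = table[idx];  // in range: copy the entry
            } else {
                ++skipped;                 // out of range: guard rejects it
            }
        }
    }
    return shared;
}
```

With a 16-entry table and a workgroup of 256, only the first 16 invocations copy anything; the guard is what keeps the other 240 from touching memory past the end of the table.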
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/get_rows_quant.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_xs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq3_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq3_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/types.comp
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp