git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
vulkan: add specific MMV kernels for IQ2 and IQ3 quants + optimizations (llama/11595)
author Rémy O <redacted>
Fri, 28 Feb 2025 08:42:52 +0000 (09:42 +0100)
committer Georgi Gerganov <redacted>
Sat, 8 Mar 2025 13:13:01 +0000 (15:13 +0200)
commit 3bab804981d28c397aef3449f6f6114ee4022366
tree f828414fe47e460ff0e4715a478cfb4093fd273c
parent c927830a70c6fc57b2a743ffeb9b07a655155889

* vulkan: implement specialized MMV kernels for IQ2 quantizations

* vulkan: add MMV kernels for IQ3 quants

* vulkan: Increase MMV batch size and unroll IQ LUT setup

* vulkan: fix init_iq_shmem for WG sizes larger than tables

* vulkan: common batch size for all I-quants
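
The `init_iq_shmem` fix in the list above concerns workgroups that have more threads than the lookup table has entries. The following is a minimal CPU-side sketch of that situation, not the shader code from this commit: `simulate_lut_copy`, `CopyResult`, and their parameters are hypothetical names, and the guard shown is only one plausible way to keep a cooperative table copy in bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU simulation of a workgroup cooperatively copying a constant
// lookup table into shared memory. All names here are illustrative assumptions.
struct CopyResult {
    std::vector<uint32_t> shared;      // simulated shared-memory copy of the table
    std::vector<int>      write_count; // how many times each entry was written
};

CopyResult simulate_lut_copy(const std::vector<uint32_t>& table, std::size_t wg_size) {
    assert(wg_size > 0); // a zero-sized workgroup would never make progress
    CopyResult r{std::vector<uint32_t>(table.size(), 0),
                 std::vector<int>(table.size(), 0)};
    // Outer loop: the trip count depends only on the table and workgroup sizes,
    // so a shader compiler could unroll it when both are compile-time constants.
    for (std::size_t base = 0; base < table.size(); base += wg_size) {
        // Inner loop stands in for the threads of one workgroup.
        for (std::size_t tid = 0; tid < wg_size; ++tid) {
            const std::size_t idx = base + tid;
            // Guard: required whenever wg_size does not divide the table size,
            // which includes the case where wg_size exceeds the table size.
            if (idx < table.size()) {
                r.shared[idx] = table[idx];
                r.write_count[idx] += 1;
            }
        }
    }
    return r;
}
```

With a 5-entry table and a simulated workgroup of 8 threads, threads 5 through 7 fall outside the table and skip the write, so every entry is copied exactly once. In a real compute shader a barrier would follow the copy before any thread reads the shared table.
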
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/get_rows_quant.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_xs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq2_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq3_s.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_iq3_xxs.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/types.comp
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp