git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536)
author Eve <redacted>
Sat, 30 Nov 2024 07:00:02 +0000 (07:00 +0000)
committer Georgi Gerganov <redacted>
Tue, 3 Dec 2024 19:05:37 +0000 (21:05 +0200)
commit 6ef9dfb0e6ac2d9091a628c4f4a9a81a9788c8e1
tree 7592644f491951d4783c622e96651c41e7ac5de9
parent 63dbe2fa33ff871d01acbb6077af45efa7cb6cc4

* subgroup 64 version with subgroup add. 15% faster

scalable version

tested for subgroup sizes 16-128

* check for subgroup multiple of 16 and greater than 16

* subgroup sizes are always a power of 2 (https://github.com/KhronosGroup/GLSL/issues/45)

* force 16 sequential threads per block

* make 16 subgroup size a constant
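
The size constraints in the notes above can be sketched in a few lines. This is only an illustration, not the code from this commit: the names `kThreadsPerBlock`, `is_power_of_two`, `subgroup_size_usable` and `groups_per_subgroup` are hypothetical, and treating 16 as the smallest usable size is an assumption based on the tested range of 16-128. The point it shows is that once subgroup sizes are known to be powers of 2, any size of at least 16 is automatically a multiple of 16, so a separate "multiple of 16" test adds nothing.

```cpp
#include <cstdint>

// Hypothetical constant: number of sequential threads assumed to work on one block.
constexpr uint32_t kThreadsPerBlock = 16;

// True if n is a nonzero power of two.
constexpr bool is_power_of_two(uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical eligibility check: a power-of-two subgroup size that is at
// least 16 is necessarily a multiple of 16, so no modulo test is needed.
constexpr bool subgroup_size_usable(uint32_t subgroup_size) {
    return is_power_of_two(subgroup_size) && subgroup_size >= kThreadsPerBlock;
}

// Number of 16-thread groups that fit in one subgroup of the given size.
constexpr uint32_t groups_per_subgroup(uint32_t subgroup_size) {
    return subgroup_size / kThreadsPerBlock;
}

// Compile-time checks over the range mentioned in the commit message (16-128).
static_assert(subgroup_size_usable(16) && subgroup_size_usable(128), "16 and 128 qualify");
static_assert(!subgroup_size_usable(8), "sizes below 16 do not qualify");
static_assert(groups_per_subgroup(64) == 4, "a 64-wide subgroup holds four 16-thread groups");
```

On a real device the subgroup size would come from the driver at runtime rather than from constants like these.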
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q6_k.comp