git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
Vulkan MMQ Integer Dot Refactor and K-Quant support (#16536)
author Ruben Ortlam <redacted>
Wed, 29 Oct 2025 13:39:03 +0000 (14:39 +0100)
committer GitHub <redacted>
Wed, 29 Oct 2025 13:39:03 +0000 (14:39 +0100)
commit bcf5bda6f5df559565d11d7c8e8295c1159a85ec
tree 8565f1db9b7080ffd77273116aa686036cc3c514
parent 3eb2be1ca5f37480aeb16102970d9e65f43347fe

* vulkan: add mmq q2_k integer dot support

* Refactor mmq caching

* Reduce mmq register use

* Load 4 quant blocks into shared memory in one step

* Pack q2_k blocks into caches of 32

* Use 32-bit accumulators for integer dot matmul

* Add q4_k mmq

* Add q3_k mmq

* Add q5_k mmq

* Add q6_k mmq

* Add mxfp4 mmq, enable MMQ MUL_MAT_ID

* Fix mmv dm loads
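
For readers unfamiliar with the "integer dot" and "32-bit accumulators" items above, here is a minimal host-side C++ sketch of the general idea: multiply packed 8-bit quants four at a time, sum into an int32, and apply the floating-point block scales once at the end. This is an illustration under assumptions, not the shader code from this commit. The `BlockI8` layout and the `pack4`, `dot4_i8` and `block_dot` helpers are hypothetical names invented for this example and do not correspond to ggml structures or GLSL functions.

```cpp
#include <cstdint>

// Hypothetical block layout for illustration only (not a ggml type):
// one float scale plus 32 signed 8-bit quantized values.
struct BlockI8 {
    float  d;
    int8_t qs[32];
};

// Pack four signed 8-bit values into one 32-bit word, lowest lane first.
static uint32_t pack4(const int8_t* p) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(p[0]))
         | (static_cast<uint32_t>(static_cast<uint8_t>(p[1])) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(p[2])) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(p[3])) << 24);
}

// Software stand-in for a packed 4x8-bit dot product with accumulate.
// Each lane is sign-extended, multiplied, and added to a 32-bit accumulator.
static int32_t dot4_i8(uint32_t a, uint32_t b, int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        const int8_t av = static_cast<int8_t>((a >> (8 * i)) & 0xFFu);
        const int8_t bv = static_cast<int8_t>((b >> (8 * i)) & 0xFFu);
        acc += static_cast<int32_t>(av) * static_cast<int32_t>(bv);
    }
    return acc;
}

// Dot product of two quantized blocks: integer math inside the block,
// a single float multiply by both scales at the end.
static float block_dot(const BlockI8& x, const BlockI8& y) {
    int32_t acc = 0;
    for (int i = 0; i < 32; i += 4) {
        acc = dot4_i8(pack4(x.qs + i), pack4(y.qs + i), acc);
    }
    return x.d * y.d * static_cast<float>(acc);
}
```

With every quant set to 127 in both blocks, the integer sum is 32 * 127 * 127 = 516128, which would not fit in a 16-bit accumulator; this is one plausible motivation for accumulating in 32 bits, though the commit message does not state its reasoning.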
18 files changed:
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs.glsl
ggml/src/ggml-vulkan/vulkan-shaders/dequant_funcs_cm2.glsl
ggml/src/ggml-vulkan/vulkan-shaders/dequant_mxfp4.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_q2_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_q4_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/dequant_q5_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q2_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q4_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q5_k.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_funcs.glsl
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_id_funcs.glsl [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mmq.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mmq_funcs.glsl
ggml/src/ggml-vulkan/vulkan-shaders/mul_mmq_shmem_types.glsl [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/types.glsl
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp