git.djapps.eu Git - pkg/ggml/sources/ggml/commit
Vulkan MMQ Integer Dot Refactor and K-Quant support (llama/16536)
author Ruben Ortlam <redacted>
Wed, 29 Oct 2025 13:39:03 +0000 (14:39 +0100)
committer Georgi Gerganov <redacted>
Sat, 1 Nov 2025 07:41:35 +0000 (09:41 +0200)
commit 558148d7422b9a6d867fd5dfae8e9974c212dc57
tree 9fdbdea3dcc501429b1e69ec0309b04f386cc613
parent 4f455950b0f665fda5bdc4cfe5afbe67f515529f

* vulkan: add mmq q2_k integer dot support

* Refactor mmq caching

* Reduce mmq register use

* Load 4 quant blocks into shared memory in one step

* Pack q2_k blocks into caches of 32

* Use 32-bit accumulators for integer dot matmul

* Add q4_k mmq

* Add q3_k mmq

* Add q5_k mmq

* Add q6_k mmq

* Add mxfp4 mmq, enable MMQ MUL_MAT_ID

* Fix mmv dm loads
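
For readers unfamiliar with the "32-bit accumulators for integer dot matmul" item above, the following is a minimal CPU-side sketch of the general idea, not the shader code from this commit. It assumes quantized values are stored as signed 8-bit integers packed four to a 32-bit word, and that each block carries one float scale per operand. The names `pack4_i8`, `dot4_i8_acc`, and `block_dot` are hypothetical and do not appear in the changed files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pack four signed 8-bit values into one 32-bit word,
// with x0 in the lowest byte.
inline uint32_t pack4_i8(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    return  static_cast<uint32_t>(static_cast<uint8_t>(x0))
         | (static_cast<uint32_t>(static_cast<uint8_t>(x1)) << 8)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x2)) << 16)
         | (static_cast<uint32_t>(static_cast<uint8_t>(x3)) << 24);
}

// Hypothetical helper: multiply the four signed 8-bit lanes of a and b
// pairwise and add the products to a 32-bit accumulator. Each product fits
// in 16 bits, but the running sum does not, hence the 32-bit accumulator.
inline int32_t dot4_i8_acc(uint32_t a, uint32_t b, int32_t acc) {
    for (int i = 0; i < 4; ++i) {
        const int8_t av = static_cast<int8_t>(static_cast<uint8_t>(a >> (8 * i)));
        const int8_t bv = static_cast<int8_t>(static_cast<uint8_t>(b >> (8 * i)));
        acc += static_cast<int32_t>(av) * static_cast<int32_t>(bv);
    }
    return acc;
}

// Dot product of one quantized block: accumulate in integers first, then
// apply the two per-block float scales (assumed here) a single time.
inline float block_dot(const uint32_t* qa, const uint32_t* qb,
                       std::size_t n_words, float da, float db) {
    assert(qa != nullptr && qb != nullptr);
    int32_t acc = 0;
    for (std::size_t i = 0; i < n_words; ++i) {
        acc = dot4_i8_acc(qa[i], qb[i], acc);
    }
    return da * db * static_cast<float>(acc);
}
```

In this sketch a single packed word of four -128 values dotted with itself already sums to 65536, which is beyond the range of a 16-bit accumulator.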
18 files changed:
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/dequant_funcs.glsl
src/ggml-vulkan/vulkan-shaders/dequant_funcs_cm2.glsl
src/ggml-vulkan/vulkan-shaders/dequant_mxfp4.comp
src/ggml-vulkan/vulkan-shaders/dequant_q2_k.comp
src/ggml-vulkan/vulkan-shaders/dequant_q4_k.comp
src/ggml-vulkan/vulkan-shaders/dequant_q5_k.comp
src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q2_k.comp
src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q4_k.comp
src/ggml-vulkan/vulkan-shaders/mul_mat_vec_q5_k.comp
src/ggml-vulkan/vulkan-shaders/mul_mm.comp
src/ggml-vulkan/vulkan-shaders/mul_mm_funcs.glsl
src/ggml-vulkan/vulkan-shaders/mul_mm_id_funcs.glsl [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/mul_mmq.comp
src/ggml-vulkan/vulkan-shaders/mul_mmq_funcs.glsl
src/ggml-vulkan/vulkan-shaders/mul_mmq_shmem_types.glsl [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/types.glsl
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp