git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: split mul_mat into multiple dispatches to avoid overflow (llama/19509)
author Jeff Bolz <redacted>
Wed, 18 Feb 2026 09:47:10 +0000 (01:47 -0800)
committer Georgi Gerganov <redacted>
Wed, 25 Feb 2026 10:32:13 +0000 (12:32 +0200)
commit 16c7a5053ec1c0d5681a8d0cf3baf898ddf3fd79
tree 1cc28fdf269f78f1b9d88cb4bb9c63936762dc2f
parent 074d57ee91a0d0504d58afe502f10a703f6e3efd

* vulkan: split mul_mat into multiple dispatches to avoid overflow

The batch dimensions can be greater than the max workgroup count limit,
in which case we need to split into multiple dispatches and pass the base
index through a push constant.

Fall back for the less common p021 and nc variants.
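
The splitting described above can be sketched roughly as follows. This is a
minimal illustration, not the code from this commit: `DispatchChunk` and
`split_batches` are hypothetical names, and the idea that the shader adds the
base index to its workgroup ID to recover the real batch index is an
assumption about how the push constant is consumed.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration; not part of ggml-vulkan.
struct DispatchChunk {
    uint32_t base;   // first batch index covered by this dispatch
                     // (would be passed to the shader via a push constant)
    uint32_t count;  // workgroup count along the batch axis for this dispatch
};

// Split `total` batch entries into dispatches of at most `max_count`
// workgroups each, where `max_count` stands in for the device's
// max workgroup count limit along that axis.
std::vector<DispatchChunk> split_batches(uint32_t total, uint32_t max_count) {
    std::vector<DispatchChunk> chunks;
    if (max_count == 0) {
        return chunks;  // nothing can be dispatched with a zero limit
    }
    for (uint32_t base = 0; base < total;) {
        const uint32_t count = std::min(total - base, max_count);
        chunks.push_back({base, count});
        base += count;  // never exceeds `total`, so it cannot wrap around
    }
    return chunks;
}
```

With a limit of 65535 (the minimum the Vulkan spec guarantees for
maxComputeWorkGroupCount), a batch dimension of 70000 would yield two
dispatches: one with base 0 and count 65535, and one with base 65535 and
count 4465. Each dispatch would then record its `base` in the push constants
before being issued.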

* address feedback
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/mul_mat_vec_base.glsl
src/ggml-vulkan/vulkan-shaders/mul_mm.comp
src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp