vulkan: split mul_mat into multiple dispatches to avoid overflow (#19509)
author Jeff Bolz <redacted>
Wed, 18 Feb 2026 09:47:10 +0000 (01:47 -0800)
committer GitHub <redacted>
Wed, 18 Feb 2026 09:47:10 +0000 (10:47 +0100)
commit d0061be838809230db7a4edf62bc9a098025ba98
tree fc47ebab4a36c6aa8f9cbc46bd6a3eb99b290edd
parent a569bda44579f64fa2676063848c6d2a8c5f7b30

* vulkan: split mul_mat into multiple dispatches to avoid overflow

The batch dimensions can exceed the device's maximum workgroup count
limit. In that case, split the work into multiple dispatches and pass
each dispatch's base index to the shader through a push constant.
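
The splitting described above can be sketched on the host side as follows. This is a minimal illustration, not the code from this commit: `DispatchChunk` and `split_dispatches` are hypothetical names, and the limit is passed in as a plain integer rather than read from the device.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical record of one dispatch along the batch dimension:
// the base batch index to pass via push constant, and the number of
// workgroups to launch in that dispatch.
struct DispatchChunk {
    uint32_t base_index;
    uint32_t group_count;
};

// Split `total_batches` workgroups into chunks no larger than
// `max_group_count` (assumed to be the device's workgroup count limit
// for that dimension).
std::vector<DispatchChunk> split_dispatches(uint32_t total_batches,
                                            uint32_t max_group_count) {
    std::vector<DispatchChunk> chunks;
    if (max_group_count == 0) {
        return chunks; // nothing can be dispatched with a zero limit
    }
    for (uint32_t base = 0; base < total_batches;) {
        // The last chunk may be smaller than the limit.
        const uint32_t count = std::min(max_group_count, total_batches - base);
        chunks.push_back({base, count});
        base += count;
    }
    return chunks;
}
```

In a real Vulkan backend, each chunk would presumably become one `vkCmdPushConstants` call carrying `base_index`, followed by one `vkCmdDispatch` with `group_count` in the batch dimension. The shader would then add the base index to its `gl_WorkGroupID` component to recover the global batch index. The Vulkan specification guarantees a `maxComputeWorkGroupCount` of at least 65535 per dimension, so a batch count above that value is what triggers the split on devices that expose only the minimum.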

Fall back for the less common p021 and nc variants.

* address feedback
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_base.glsl
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp