git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
vulkan: fix SSM_CONV PP scaling with large ubatch sizes (#20379)
author ProgenyAlpha <redacted>
Thu, 12 Mar 2026 09:03:18 +0000 (05:03 -0400)
committer GitHub <redacted>
Thu, 12 Mar 2026 09:03:18 +0000 (10:03 +0100)
commit 40c550d4f6838180fa40eda8351043353911ca40
tree 10e3dada9689c03062f87d0e3b1151dacfd2af93
parent de190154c85d20e24dbeae8c8af1849402ae5098

* vulkan: optimize SSM_CONV workgroup dispatch for large ubatch

Tile tokens into 2D workgroups (32x16) to reduce workgroup launch
overhead at large ubatch sizes. Add vec4 fast path for nc=4 (common
d_conv size). Fixes PP performance degradation with ubatch > 512.

Ref: ggml-org/llama.cpp#18725

Co-Authored-By: Claude Opus 4.6 <redacted>

* vulkan: remove unused shared memory declaration in SSM_CONV

Co-Authored-By: Claude Opus 4.6 <redacted>
---------

Co-authored-by: Progeny Alpha <redacted>
Co-authored-by: Claude Opus 4.6 <redacted>

ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/ssm_conv.comp
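
The commit message names two changes: tiling tokens into 2D workgroups (32x16) and a vec4 fast path for nc=4. The following is a minimal CPU-side sketch of both ideas, not the code from this commit. Everything in it is an assumption made for illustration: the helper names (`groups_per_token`, `groups_tiled`, `ssm_conv_ref`), the reading of 32x16 as 32 channel rows by 16 tokens, the "one workgroup row per token" scheme used as the comparison baseline, and the memory layout (each row of `x` holds `nc - 1 + n_tokens` samples, each row of `w` holds `nc` weights).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tile shape read off the "32x16" in the commit message.
// Which axis is which is an assumption: 32 channel rows by 16 tokens.
constexpr uint32_t TILE_ROWS   = 32;
constexpr uint32_t TILE_TOKENS = 16;

constexpr uint32_t ceil_div(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Assumed baseline: each workgroup covers 32 rows of a single token,
// so the dispatch grid grows linearly with the number of tokens.
std::array<uint32_t, 3> groups_per_token(uint32_t n_rows, uint32_t n_tokens, uint32_t n_seqs) {
    return {ceil_div(n_rows, TILE_ROWS), n_tokens, n_seqs};
}

// Tiled variant: each workgroup covers a 32-row by 16-token tile,
// so the token axis of the grid shrinks by a factor of up to 16.
std::array<uint32_t, 3> groups_tiled(uint32_t n_rows, uint32_t n_tokens, uint32_t n_seqs) {
    return {ceil_div(n_rows, TILE_ROWS), ceil_div(n_tokens, TILE_TOKENS), n_seqs};
}

// Reference sliding-window convolution for one sequence.
//   x:   n_rows rows, each with (nc - 1 + n_tokens) samples (assumed layout)
//   w:   n_rows rows, each with nc weights
//   out: n_tokens x n_rows, stored as out[t * n_rows + r]
std::vector<float> ssm_conv_ref(const std::vector<float>& x, const std::vector<float>& w,
                                uint32_t nc, uint32_t n_rows, uint32_t n_tokens) {
    const uint32_t ncs = nc - 1 + n_tokens;  // samples per input row
    std::vector<float> out(static_cast<size_t>(n_rows) * n_tokens);
    for (uint32_t t = 0; t < n_tokens; ++t) {
        for (uint32_t r = 0; r < n_rows; ++r) {
            const float* s = &x[static_cast<size_t>(r) * ncs + t];  // window start for token t
            const float* c = &w[static_cast<size_t>(r) * nc];       // weights for row r
            float sum = 0.0f;
            if (nc == 4) {
                // Fast path: fixed 4-wide dot product, the scalar analogue
                // of a single vec4 dot() in a shader.
                sum = s[0] * c[0] + s[1] * c[1] + s[2] * c[2] + s[3] * c[3];
            } else {
                // Generic path for any other kernel width.
                for (uint32_t j = 0; j < nc; ++j) {
                    sum += s[j] * c[j];
                }
            }
            out[static_cast<size_t>(t) * n_rows + r] = sum;
        }
    }
    return out;
}
```

Under these assumptions, a ubatch of 2048 tokens with 64 rows needs a 2 x 2048 x 1 grid in the per-token scheme but only 2 x 128 x 1 when tiled, which is the kind of reduction in workgroup launches the commit message describes.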