vulkan: fix SSM_CONV PP scaling with large ubatch sizes (llama/20379)
author ProgenyAlpha <redacted>
Thu, 12 Mar 2026 09:03:18 +0000 (05:03 -0400)
committer Georgi Gerganov <redacted>
Sun, 15 Mar 2026 19:50:13 +0000 (21:50 +0200)
commit 314c3ae4c94b454cd0361253b7c42d4a51329ce6
tree e829ee95c28a5e29bc81d6959de3749d837dcf8e
parent f24800be624bc2d8ce0e92c092d2816da41bc730

* vulkan: optimize SSM_CONV workgroup dispatch for large ubatch

Tile tokens into 2D workgroups (32x16) to reduce workgroup launch
overhead at large ubatch sizes. Add vec4 fast path for nc=4 (common
d_conv size). Fixes PP performance degradation with ubatch > 512.

Ref: ggml-org/llama.cpp#18725

Co-Authored-By: Claude Opus 4.6 <redacted>

* vulkan: remove unused shared memory declaration in SSM_CONV

Co-Authored-By: Claude Opus 4.6 <redacted>
---------

Co-authored-by: Progeny Alpha <redacted>
Co-authored-by: Claude Opus 4.6 <redacted>
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/ssm_conv.comp
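
The effect of the token tiling described in the commit message can be illustrated with a small host-side sketch. This is not the code from ggml-vulkan.cpp or ssm_conv.comp: the names TILE_ROWS, TILE_TOKENS, workgroups_per_token, workgroups_tiled and conv4 are hypothetical, the mapping of "32x16" to 32 channels by 16 tokens is an assumption, and so is the untiled baseline of one token per workgroup. It only shows how the workgroup count changes when 16 tokens share a workgroup, and what a 4-tap (nc=4) convolution computes for one output element.

```cpp
#include <cstdint>

// Hypothetical tile shape taken from the "32x16" in the commit message.
// Which axis is channels and which is tokens is an assumption.
constexpr uint32_t TILE_ROWS   = 32; // channels (rows) per workgroup
constexpr uint32_t TILE_TOKENS = 16; // tokens per workgroup

// Integer division rounded up, as used when sizing a dispatch grid.
constexpr uint32_t ceil_div(uint32_t a, uint32_t b) {
    return (a + b - 1) / b;
}

// Assumed baseline: each workgroup covers TILE_ROWS channels of ONE token.
constexpr uint64_t workgroups_per_token(uint32_t n_rows, uint32_t n_tokens) {
    return uint64_t(ceil_div(n_rows, TILE_ROWS)) * n_tokens;
}

// Tiled layout: each workgroup covers a TILE_ROWS x TILE_TOKENS tile.
constexpr uint64_t workgroups_tiled(uint32_t n_rows, uint32_t n_tokens) {
    return uint64_t(ceil_div(n_rows, TILE_ROWS)) * ceil_div(n_tokens, TILE_TOKENS);
}

// Scalar reference for one output element when nc == 4: the dot product
// of 4 consecutive inputs with the 4 filter taps of that channel. A GLSL
// vec4 fast path could express the same thing as a single dot(vec4, vec4).
constexpr float conv4(const float *x, const float *w) {
    return x[0] * w[0] + x[1] * w[1] + x[2] * w[2] + x[3] * w[3];
}
```

Under these assumptions, 4096 channels and a ubatch of 2048 tokens need 128 * 2048 = 262144 workgroups in the per-token layout but only 128 * 128 = 16384 in the tiled layout, a 16x reduction in launches. The values 4096 and 2048 are illustrative, not measurements from the commit.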