author	ProgenyAlpha <redacted>
	Thu, 12 Mar 2026 09:03:18 +0000 (05:03 -0400)
committer	GitHub <redacted>
	Thu, 12 Mar 2026 09:03:18 +0000 (10:03 +0100)
commit	40c550d4f6838180fa40eda8351043353911ca40
tree	10e3dada9689c03062f87d0e3b1151dacfd2af93
parent	de190154c85d20e24dbeae8c8af1849402ae5098

vulkan: fix SSM_CONV PP scaling with large ubatch sizes (#20379)

* vulkan: optimize SSM_CONV workgroup dispatch for large ubatch

Tile tokens into 2D workgroups (32x16) to reduce workgroup launch
overhead at large ubatch sizes. Add vec4 fast path for nc=4 (common
d_conv size). Fixes PP performance degradation with ubatch > 512.
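
To illustrate why tiling reduces launch overhead, here is a minimal host-side sketch of the workgroup-count arithmetic. It is not the code from this commit. The helper names (`ceil_div`, `dispatch_untiled`, `dispatch_tiled`, `conv_nc4`) are hypothetical, and the assumed layouts (32 channels per workgroup with one token per workgroup before the change, a 32x16 channel-by-token tile after it) are guesses based only on the description above.

```cpp
#include <cstdint>

// Hypothetical helper: integer ceiling division.
static uint32_t ceil_div(uint32_t a, uint32_t b) {
    return (a + b - 1) / b;
}

// Number of workgroups launched along each dispatch axis.
struct DispatchDims {
    uint32_t x;
    uint32_t y;
};

// ASSUMED "before" layout: each workgroup covers 32 channels of one token,
// so the y dimension grows linearly with the ubatch size.
static DispatchDims dispatch_untiled(uint32_t n_channels, uint32_t n_tokens) {
    return { ceil_div(n_channels, 32), n_tokens };
}

// ASSUMED "after" layout: each workgroup covers a 32x16 tile
// (32 channels by 16 tokens), so y shrinks by a factor of 16.
static DispatchDims dispatch_tiled(uint32_t n_channels, uint32_t n_tokens) {
    return { ceil_div(n_channels, 32), ceil_div(n_tokens, 16) };
}

// Scalar reference for one output element when nc == 4: a 4-tap dot
// product, which a shader could evaluate as a single vec4 dot().
static float conv_nc4(const float * x, const float * w) {
    return x[0] * w[0] + x[1] * w[1] + x[2] * w[2] + x[3] * w[3];
}
```

Under these assumptions, 4096 channels with a ubatch of 2048 tokens would go from 128 x 2048 workgroups to 128 x 128.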

Ref: ggml-org/llama.cpp#18725

Co-Authored-By: Claude Opus 4.6 <redacted>

* vulkan: remove unused shared memory declaration in SSM_CONV

Co-Authored-By: Claude Opus 4.6 <redacted>
---------

Co-authored-by: Progeny Alpha <redacted>
Co-authored-by: Claude Opus 4.6 <redacted>

ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/ssm_conv.comp