author	Jeff Bolz <redacted>
	Mon, 20 Jan 2025 16:38:32 +0000 (10:38 -0600)
committer	Georgi Gerganov <redacted>
	Wed, 29 Jan 2025 10:57:00 +0000 (12:57 +0200)
commit	b73eb58aef62ef7262b827b31297ad038f47b78b
tree	6872f92c19814da1e44c37ccce645e6d407166ca
parent	91c426d2f8daa0aa7d8a1ba5ca8940523766fa99

vulkan: fix coopmat2 validation failures (llama/11284)

The mul mat and flash attention shaders were loading f32 data directly into
A/B matrices, which happens to work but is technically invalid usage.
For FA, we can load it as an Accumulator matrix and then convert; this
is not in the inner loop, so it is cheap enough. For mul mat, it's more
efficient to do this conversion in a separate pass and have the input(s)
be f16.

coopmat2 requires SPIR-V 1.6 (related to using LocalSizeId). LocalSizeId
requires maintenance4 to be enabled, and SPIR-V 1.6 requires Vulkan 1.3.
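
The requirement chain above can be sketched as a single capability check.
This is a minimal illustration, not the code from this commit: DeviceCaps,
make_api_version and can_use_coopmat2 are hypothetical names, and the struct
stands in for values a backend would normally query from the Vulkan device.
It also assumes that "coopmat2" refers to the VK_NV_cooperative_matrix2
extension.

```cpp
#include <cstdint>

// Hypothetical summary of what a device reports; not a ggml or Vulkan type.
struct DeviceCaps {
    uint32_t api_version;  // packed like VkPhysicalDeviceProperties::apiVersion
    bool maintenance4;     // maintenance4 feature is supported and enabled
    bool coopmat2_ext;     // assumed: VK_NV_cooperative_matrix2 is exposed
};

// Same bit layout as VK_MAKE_API_VERSION(0, major, minor, 0):
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr uint32_t make_api_version(uint32_t major, uint32_t minor) {
    return (major << 22) | (minor << 12);
}

// coopmat2 shaders need SPIR-V 1.6, which needs Vulkan 1.3;
// their use of LocalSizeId additionally needs maintenance4.
constexpr bool can_use_coopmat2(const DeviceCaps& caps) {
    return caps.api_version >= make_api_version(1, 3)
        && caps.maintenance4
        && caps.coopmat2_ext;
}
```

Under these assumptions, a device reporting Vulkan 1.2, or one where
maintenance4 is not enabled, would fall back to a non-coopmat2 path even
if the extension itself is present.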

src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp