git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	Jeff Bolz <redacted>
	Mon, 20 Jan 2025 16:38:32 +0000 (10:38 -0600)
committer	GitHub <redacted>
	Mon, 20 Jan 2025 16:38:32 +0000 (10:38 -0600)
commit	aea8ddd5165d525a449e2fc3839db77a71f4a318
tree	66841c3fd8c5c9db473d13c7d0d6b73c3e47ae26	tree
parent	9f7add1cde670ff744b791f5ed6649e86a0c586c	commit \| diff

vulkan: fix coopmat2 validation failures (#11284)

mul mat and flash attention shaders were loading f32 types directly into
A/B matrices, which happens to work but is technically invalid usage.
For FA, we can load it as an Accumulator matrix and convert and this
is not in the inner loop and is cheap enough. For mul mat, it's more
efficient to do this conversion in a separate pass and have the input(s)
be f16.

coopmat2 requires SPIR-V 1.6 (related using to LocalSizeId). LocalSizeId
requires maintenance4 be enabled, and SPIR-V 1.6 requires Vulkan 1.3.

ggml/src/ggml-vulkan/ggml-vulkan.cpp		diff \| blob \| history
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp		diff \| blob \| history
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp		diff \| blob \| history
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp		diff \| blob \| history