git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit

author	Jeff Bolz <redacted>
	Mon, 20 Jan 2025 16:38:32 +0000 (10:38 -0600)
committer	Georgi Gerganov <redacted>
	Mon, 3 Feb 2025 20:00:57 +0000 (22:00 +0200)
commit	0dcada42d4af59d66bf7e48c4d9fac842e8285b8
tree	ee8a7a74a1c78fd33e3f8ac76fc1187fe83f3652
parent	d507b4cebe085da9b99aaebff82b5282d64279a2

vulkan: fix coopmat2 validation failures (llama/11284)

The mul mat and flash attention shaders were loading f32 data directly into
A/B matrices, which happens to work but is technically invalid usage.
For FA, we can load the data as an Accumulator matrix and then convert it;
this is outside the inner loop, so it is cheap enough. For mul mat, it's more
efficient to do the conversion in a separate pass and have the input(s)
be f16.
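
To illustrate the "separate pass" idea for mul mat, here is a minimal CPU-side sketch in C++. It is not the shader code from this commit: the names f32_to_f16_bits and convert_pass_f32_to_f16 are hypothetical, and the sketch only assumes standard IEEE 754 half precision with round-to-nearest-even. The point is that the f32 input is converted once up front, so the matrix multiply itself only ever reads f16.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: convert one IEEE 754 binary32 value to binary16 bits,
// rounding to nearest even. Not taken from ggml.
inline uint16_t f32_to_f16_bits(float f) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const uint32_t sign = (x >> 16) & 0x8000u;
    const uint32_t exp  = (x >> 23) & 0xFFu;
    uint32_t mant       = x & 0x7FFFFFu;

    if (exp == 0xFFu) {  // Inf or NaN: keep NaN-ness with a quiet bit
        return static_cast<uint16_t>(sign | 0x7C00u | (mant ? 0x200u : 0u));
    }
    const int e = static_cast<int>(exp) - 127 + 15;  // rebias for f16
    if (e >= 31) {       // too large for f16: becomes infinity
        return static_cast<uint16_t>(sign | 0x7C00u);
    }
    if (e <= 0) {        // result is an f16 subnormal or zero
        if (e < -10) {
            return static_cast<uint16_t>(sign);
        }
        mant |= 0x800000u;  // make the implicit leading 1 explicit
        const int shift      = 14 - e;
        uint32_t h           = mant >> shift;
        const uint32_t rem   = mant & ((1u << shift) - 1u);
        const uint32_t half  = 1u << (shift - 1);
        if (rem > half || (rem == half && (h & 1u))) {
            ++h;
        }
        return static_cast<uint16_t>(sign | h);
    }
    uint32_t h = (static_cast<uint32_t>(e) << 10) | (mant >> 13);
    const uint32_t rem = mant & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (h & 1u))) {
        ++h;  // a carry into the exponent is the correct rounding result
    }
    return static_cast<uint16_t>(sign | h);
}

// Hypothetical "conversion pass": run once over the f32 input so that the
// later matrix multiply only sees f16 data.
inline std::vector<uint16_t> convert_pass_f32_to_f16(const std::vector<float> &src) {
    std::vector<uint16_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = f32_to_f16_bits(src[i]);
    }
    return dst;
}
```

On the GPU the same split would be a small conversion dispatch followed by the mul mat dispatch; the sketch above only shows the data flow, not the Vulkan plumbing.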

coopmat2 requires SPIR-V 1.6 (related to using LocalSizeId). LocalSizeId
requires maintenance4 to be enabled, and SPIR-V 1.6 requires Vulkan 1.3.
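
The requirement chain above can be sketched as a single gating check. This is a hedged illustration, not the logic in ggml-vulkan.cpp: DeviceCaps, make_api_version and can_use_coopmat2 are hypothetical names, and the sketch avoids the Vulkan headers so it stays self-contained. It assumes the version is packed the same way as VK_MAKE_API_VERSION with variant 0, and that the other two fields are filled in from the device's reported maintenance4 feature and cooperative matrix 2 extension support.

```cpp
#include <cstdint>

// Hypothetical summary of what a device reports; not a ggml or Vulkan type.
struct DeviceCaps {
    uint32_t api_version;   // packed like VkPhysicalDeviceProperties::apiVersion
    bool     maintenance4;  // maintenance4 feature is available and enabled
    bool     coopmat2_ext;  // cooperative matrix 2 extension is advertised
};

// Same bit layout as VK_MAKE_API_VERSION with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr uint32_t make_api_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// coopmat2 shaders need SPIR-V 1.6 (hence Vulkan 1.3) and LocalSizeId
// (hence maintenance4), so all three conditions must hold.
inline bool can_use_coopmat2(const DeviceCaps &caps) {
    const bool vk13 = caps.api_version >= make_api_version(1, 3, 0);
    return vk13 && caps.maintenance4 && caps.coopmat2_ext;
}
```

A caller would fall back to a non-coopmat2 shader path whenever this returns false, for example on a Vulkan 1.2 device that otherwise advertises the extension.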

ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mm_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp