vulkan: fix coopmat2 flash attention for non-contiguous inputs (#11281)
author Jeff Bolz <redacted>
Sat, 18 Jan 2025 08:26:50 +0000 (02:26 -0600)
committer GitHub <redacted>
Sat, 18 Jan 2025 08:26:50 +0000 (09:26 +0100)
commit 44e18ef93995f3040660750b527e5becf85899d0
tree e1da805d26368a7d1c696b3cda9efe19fd6c29c3
parent 3edfa7d3753c29e44b964c0ff424d2ea8d5fdee6

Add code similar to mul_mm_cm2 to force alignment of strides, to avoid
a performance regression.
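
A minimal sketch of the stride-alignment idea, assuming a hypothetical helper fa_align (this is not the actual ggml-vulkan code): the dispatcher picks the largest power-of-two alignment that divides every row stride, so an "aligned" shader variant can assume wide loads, while noncontiguous inputs fall back to alignment 1.

#include <algorithm>
#include <cstdint>

// Largest power of two dividing x (x & -x); treat 0 as "fully aligned".
static uint32_t largest_pot_divisor(uint32_t x) {
    return x == 0 ? UINT32_MAX : (x & (~x + 1u));
}

// Hypothetical: choose the flash-attention stride alignment, capped at the
// shader's preferred alignment (e.g. the coopmat2 load width in elements).
static uint32_t fa_align(uint32_t stride_q, uint32_t stride_k,
                         uint32_t stride_v, uint32_t max_align) {
    uint32_t a = std::min({largest_pot_divisor(stride_q),
                           largest_pot_divisor(stride_k),
                           largest_pot_divisor(stride_v),
                           max_align});
    return std::max(a, 1u); // noncontiguous strides degrade to alignment 1
}

Under this scheme, contiguous tensors keep the fast aligned path and only genuinely noncontiguous ones take the safe path, which is how the fix avoids regressing the common case.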

Add noncontiguous FA tests in test-backend-ops.
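
A sketch of what such a test case might look like, assuming the ggml_flash_attn_ext(ctx, q, k, v, mask, scale, max_bias, logit_softcap) API; the helper name and shapes are illustrative, not the actual test-backend-ops code:

#include "ggml.h"
#include <cmath>
#include <cstddef>

// Hypothetical helper: build a flash-attention op whose Q/K/V are
// noncontiguous. Each tensor is allocated with dims 0 and 2 swapped and then
// permuted back, so the logical shape is standard but the strides are not.
static struct ggml_tensor * build_noncontig_fa(struct ggml_context * ctx) {
    const int64_t hsk = 128, hsv = 128, kv = 512, nb = 8, nh = 4;

    struct ggml_tensor * q = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, nh, nb, hsk, 1);
    struct ggml_tensor * k = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, nh, kv, hsk, 1);
    struct ggml_tensor * v = ggml_new_tensor_4d(ctx, GGML_TYPE_F16, nh, kv, hsv, 1);

    q = ggml_permute(ctx, q, 2, 1, 0, 3); // logical [hsk, nb, nh, 1], noncontiguous
    k = ggml_permute(ctx, k, 2, 1, 0, 3); // logical [hsk, kv, nh, 1], noncontiguous
    v = ggml_permute(ctx, v, 2, 1, 0, 3); // logical [hsv, kv, nh, 1], noncontiguous

    const float scale = 1.0f / sqrtf((float) hsk);
    return ggml_flash_attn_ext(ctx, q, k, v, /*mask=*/NULL, scale,
                               /*max_bias=*/0.0f, /*logit_softcap=*/0.0f);
}

In a test harness one would typically make the permutation a parameter so both the contiguous and permuted layouts are exercised against the CPU reference.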

Fixes #11268.
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
tests/test-backend-ops.cpp