vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)
author Jeff Bolz <redacted>
Sat, 18 Jan 2025 08:26:50 +0000 (02:26 -0600)
committer Georgi Gerganov <redacted>
Wed, 29 Jan 2025 10:57:00 +0000 (12:57 +0200)
commit 93e1a8631d8ec9b4313a3648cd494a6786fbe04b
tree f2499a4c49e1b0865b3f3a126c0697cfa6a20556
parent ab207130b62274c0f04ad5ad3f2b056f247b7fb2
vulkan: fix coopmat2 flash attention for non-contiguous inputs (llama/11281)

Add code, similar to mul_mm_cm2, that forces alignment of the strides
to avoid a performance regression.

Add noncontiguous FA tests in test-backend-ops.

Fixes #11268.
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
tests/test-backend-ops.cpp