git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
vulkan: use aligned loads for flash attention mask (#12853)
authorJeff Bolz <redacted>
Sat, 12 Apr 2025 08:44:48 +0000 (03:44 -0500)
committerGitHub <redacted>
Sat, 12 Apr 2025 08:44:48 +0000 (10:44 +0200)
commita4837577aae59c0ae640f3810094724bcac6bb28
tree9d7ece5aa79a25e62f7a7ce24b20a26a33b6fcf3
parente59ea539b83d2c7947c99bd350549364dbba450c
vulkan: use aligned loads for flash attention mask (#12853)

Rewrite the stride logic for the mask tensor in the FA shader to force the
stride to be aligned, allowing more efficient loads to be used.
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp