git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
vulkan: use aligned loads for flash attention mask (llama/12853)
author Jeff Bolz <redacted>
Sat, 12 Apr 2025 08:44:48 +0000 (03:44 -0500)
committer Georgi Gerganov <redacted>
Thu, 24 Apr 2025 17:39:16 +0000 (20:39 +0300)
commit 751e42b21eb2edc005d2027d48cba12c3361f6ba
tree bd175c31326abfada379e2e812140ca41cad33d2
parent e8ee32d12d9605c8978ba6ceac7a65a6a94940ce

Rewrite the stride logic for the mask tensor in the FA shader to force the
stride to be aligned, to allow using more efficient loads.
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
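
The idea in the commit message, forcing a row stride to a known alignment so that each row can be read with wider loads, can be illustrated on the host side. The following is a minimal C++ sketch, not the shader code from this commit: the helper names (align_up, rows_aligned, row_offset) and the alignment of 8 elements are assumptions made for illustration, and the actual change in flash_attn_cm2.comp may express the alignment differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a row stride (in elements) up to the next
// multiple of `align`. `align` must be a power of two.
constexpr uint32_t align_up(uint32_t stride, uint32_t align) {
    return (stride + align - 1) & ~(align - 1);
}

// Hypothetical helper: if the stride is a multiple of `align`, then every
// row start (row * stride) is also a multiple of `align`, so each row can
// be read with loads that are `align` elements wide.
constexpr bool rows_aligned(uint32_t stride, uint32_t align) {
    return stride % align == 0;
}

// Hypothetical helper: element offset of the start of `row`, valid only
// for an aligned stride.
inline uint32_t row_offset(uint32_t row, uint32_t stride, uint32_t align) {
    assert(rows_aligned(stride, align) && "stride must be aligned");
    return row * stride;
}

// Assumed example values: a row of 1001 elements and an alignment of
// 8 elements (16 bytes for 16-bit elements).
static_assert(align_up(1001, 8) == 1008, "1001 rounds up to 1008");
static_assert(rows_aligned(align_up(1001, 8), 8), "padded stride is aligned");
static_assert(!rows_aligned(1001, 8), "unpadded stride is not aligned");
```

With an unaligned stride such as 1001, only the first row is guaranteed to start on an aligned boundary. Once the stride is a multiple of the alignment, that guarantee holds for every row, and this is what makes the wider load safe.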