git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)
author Jeff Bolz <redacted>
Thu, 5 Feb 2026 15:26:38 +0000 (09:26 -0600)
committer GitHub <redacted>
Thu, 5 Feb 2026 15:26:38 +0000 (09:26 -0600)
commit 449ec2ab0751fc713fe338da2ced153125b5c674
tree c54c02a6549bc770e1a2c747de5b6e321b1ec84b
parent 3795cc1e89e16fbc145f8a5457ea30abd86e0d1d

Write out a 2-bit code per block and avoid loading the mask when it
matches these two common cases.

Apply this optimization when the mask is relatively large (i.e. prompt
processing).
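
The preprocessing idea can be sketched on the CPU as follows. This is a minimal illustration and not the shader code from this commit: the code values, the 16-codes-per-word packing, the block dimensions, and the names `classify_block`, `preprocess_mask`, and `get_block_code` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-bit codes; the encoding used by the actual shaders may differ.
enum MaskBlockCode : uint32_t {
    MASK_BLOCK_MIXED       = 0, // block must be loaded and applied as usual
    MASK_BLOCK_ALL_NEG_INF = 1, // every entry is -inf
    MASK_BLOCK_ALL_ZERO    = 2, // every entry is 0
};

// Classify one tile of a row-major rows x cols mask. Tiles at the edge are clipped.
MaskBlockCode classify_block(const std::vector<float> & mask, size_t rows, size_t cols,
                             size_t r0, size_t c0, size_t block_rows, size_t block_cols) {
    bool all_neg_inf = true;
    bool all_zero    = true;
    const size_t r1 = std::min(r0 + block_rows, rows);
    const size_t c1 = std::min(c0 + block_cols, cols);
    for (size_t r = r0; r < r1; ++r) {
        for (size_t c = c0; c < c1; ++c) {
            const float v = mask[r * cols + c];
            all_neg_inf = all_neg_inf && std::isinf(v) && v < 0.0f;
            all_zero    = all_zero && v == 0.0f;
        }
    }
    if (all_neg_inf) return MASK_BLOCK_ALL_NEG_INF;
    if (all_zero)    return MASK_BLOCK_ALL_ZERO;
    return MASK_BLOCK_MIXED;
}

// Produce one 2-bit code per block, packed 16 codes per 32-bit word (assumed layout).
std::vector<uint32_t> preprocess_mask(const std::vector<float> & mask, size_t rows, size_t cols,
                                      size_t block_rows, size_t block_cols) {
    assert(mask.size() == rows * cols);
    assert(block_rows > 0 && block_cols > 0);
    const size_t blocks_r = (rows + block_rows - 1) / block_rows;
    const size_t blocks_c = (cols + block_cols - 1) / block_cols;
    std::vector<uint32_t> packed((blocks_r * blocks_c + 15) / 16, 0u);
    for (size_t br = 0; br < blocks_r; ++br) {
        for (size_t bc = 0; bc < blocks_c; ++bc) {
            const size_t   idx  = br * blocks_c + bc;
            const uint32_t code = classify_block(mask, rows, cols, br * block_rows,
                                                 bc * block_cols, block_rows, block_cols);
            packed[idx / 16] |= code << (2 * (idx % 16));
        }
    }
    return packed;
}

// Read back the code for block idx; a consumer would branch on this before touching the mask.
uint32_t get_block_code(const std::vector<uint32_t> & packed, size_t idx) {
    return (packed[idx / 16] >> (2 * (idx % 16))) & 3u;
}
```

With a causal mask, blocks entirely above the diagonal would get the all-neg-inf code and blocks entirely below it the all-zero code, so only blocks straddling the diagonal would need their mask values loaded.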
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_base.glsl
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm1.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_mask_opt.comp [new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
tests/test-backend-ops.cpp