git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (llama/19281)
author Jeff Bolz <redacted>
Thu, 5 Feb 2026 15:26:38 +0000 (09:26 -0600)
committer Georgi Gerganov <redacted>
Sat, 7 Feb 2026 08:37:38 +0000 (10:37 +0200)
commit 8ab9ddbf6b051daaf8d6cf248c98b4c2d14316eb
tree 43830d9187bb6ca19ea7eaeb0cf43bd73dfeb6c9
parent 51f911d35c8d04ca5a98575dc56c488fc12dfaec

Write out a 2-bit code per block and avoid loading the mask when it
matches these two common cases.

Apply this optimization when the mask is relatively large (i.e. prompt
processing).
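
The preprocessing idea described above can be sketched on the CPU as follows. This is only an illustration, not the shader code from this commit: the code values, the block dimensions, the packing of 16 codes per 32-bit word, and the names classify_block, preprocess_mask, and block_code are all assumptions made for the example. It also assumes a row-major float mask whose dimensions are multiples of the block size.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-bit codes; the values used by the actual shader may differ.
enum MaskBlockCode : uint32_t {
    MASK_MIXED       = 0, // block has mixed values and must be loaded normally
    MASK_ALL_NEG_INF = 1, // every entry is -inf
    MASK_ALL_ZERO    = 2, // every entry is 0
};

// Classify one block_rows x block_cols tile of a row-major mask.
static MaskBlockCode classify_block(const std::vector<float>& mask, size_t n_cols,
                                    size_t row0, size_t col0,
                                    size_t block_rows, size_t block_cols) {
    bool all_neg_inf = true;
    bool all_zero    = true;
    for (size_t r = row0; r < row0 + block_rows; ++r) {
        for (size_t c = col0; c < col0 + block_cols; ++c) {
            const float v = mask[r * n_cols + c];
            all_neg_inf = all_neg_inf && std::isinf(v) && v < 0.0f;
            all_zero    = all_zero && v == 0.0f;
        }
    }
    if (all_neg_inf) return MASK_ALL_NEG_INF;
    if (all_zero)    return MASK_ALL_ZERO;
    return MASK_MIXED;
}

// Write one 2-bit code per block, packed 16 codes per uint32_t (assumed layout).
// Assumes n_rows and n_cols are multiples of the block dimensions.
static std::vector<uint32_t> preprocess_mask(const std::vector<float>& mask,
                                             size_t n_rows, size_t n_cols,
                                             size_t block_rows, size_t block_cols) {
    const size_t blocks_r = n_rows / block_rows;
    const size_t blocks_c = n_cols / block_cols;
    std::vector<uint32_t> packed((blocks_r * blocks_c + 15) / 16, 0u);
    for (size_t i = 0; i < blocks_r; ++i) {
        for (size_t j = 0; j < blocks_c; ++j) {
            const size_t   idx  = i * blocks_c + j;
            const uint32_t code = static_cast<uint32_t>(classify_block(
                mask, n_cols, i * block_rows, j * block_cols, block_rows, block_cols));
            packed[idx / 16] |= code << (2 * (idx % 16));
        }
    }
    return packed;
}

// Read back the code for block idx; a consumer would load the mask only
// when this returns MASK_MIXED.
static uint32_t block_code(const std::vector<uint32_t>& packed, size_t idx) {
    return (packed[idx / 16] >> (2 * (idx % 16))) & 3u;
}
```

In a causal prompt-processing mask, blocks entirely above the diagonal would classify as all-neg-inf and blocks entirely below it as all-zero, so only blocks that straddle the diagonal would still require a mask load.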
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/flash_attn.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_base.glsl
src/ggml-vulkan/vulkan-shaders/flash_attn_cm1.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_mask_opt.comp [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
tests/test-backend-ops.cpp