git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: support flash attention GQA/split_k with small batches (llama/18938)
author Jeff Bolz <redacted>
Wed, 21 Jan 2026 16:43:43 +0000 (10:43 -0600)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 11:49:29 +0000 (13:49 +0200)
commit 541c0510c71cc4f852c91e62beddd2950a634a98
tree abe623c5647b3e672b7eb1d10dffc3186cb80cf9
parent 6c0e0d9940c913f17ab13a71825e4157fba16d24
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/flash_attn.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_base.glsl
src/ggml-vulkan/vulkan-shaders/flash_attn_cm1.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
src/ggml-vulkan/vulkan-shaders/flash_attn_split_k_reduce.comp
tests/test-backend-ops.cpp