git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
vulkan: support flash attention GQA/split_k with small batches (llama/18938)
author Jeff Bolz <redacted>
Wed, 21 Jan 2026 16:43:43 +0000 (10:43 -0600)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit b2bc4d810b2df64b33be6613fc76c7769fecf503
tree 3807b4fb394f68f54c5a29d6697a9e534374523d
parent 3bbf4ced474f5cd43dbbd827b6815ffb526ac33e
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_base.glsl
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm1.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp
ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_split_k_reduce.comp