vulkan: use aligned loads for flash attention mask (llama/12853)
author Jeff Bolz <redacted>
Sat, 12 Apr 2025 08:44:48 +0000 (03:44 -0500)
committer Georgi Gerganov <redacted>
Mon, 14 Apr 2025 07:35:15 +0000 (10:35 +0300)
commit 37f901aa942d75fd8476fc5b46fc0d19e3e84934
tree d6666da46cdd3142ba0231127e7484b889e6e77a
parent 90208ef867a60e01d23af9f446ecd07c0f40d0e9
vulkan: use aligned loads for flash attention mask (llama/12853)

Rewrite the stride logic for the mask tensor in the FA shader so that the
stride is always aligned, which allows using more efficient loads.
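The general idea behind forcing an aligned stride is sketched below in plain C rather than the actual GLSL shader: if the row stride of the mask tensor is rounded up to a multiple of the vector load width, every row starts on an aligned boundary and can be read with wide loads instead of scalar ones. The names `align_up`, `ALIGN_ELEMS`, and the example values are illustrative assumptions, not the shader's real variables.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative sketch only (not the shader code): round a row stride up to a
     * multiple of the vector load width so each row begins on an aligned boundary
     * and can be fetched with wide (e.g. 16-byte) loads. */
    #define ALIGN_ELEMS 8u  /* hypothetical: 8 fp16 elements = 16 bytes */

    static uint32_t align_up(uint32_t n, uint32_t a) {
        return (n + a - 1u) / a * a;
    }

    int main(void) {
        uint32_t n_kv   = 113u;                           /* hypothetical mask row length  */
        uint32_t stride = align_up(n_kv, ALIGN_ELEMS);    /* 120: aligned row stride       */
        printf("row length %u -> aligned stride %u elements\n", n_kv, stride);
        return 0;
    }

The trade-off is a small amount of padding per row in exchange for loads that the shader can issue at the full vector width.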
src/ggml-vulkan/vulkan-shaders/flash_attn_cm2.comp