git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: fix padding of GQA to power of 2 in FA (llama/19115)
author Johannes Gäßler <redacted>
Mon, 26 Jan 2026 22:24:58 +0000 (23:24 +0100)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 11:49:29 +0000 (13:49 +0200)
commit 43e6c668ee9be3d1a9a264ab36942528219a6681
tree c3ccb986e09ea5c896e4b4d029fc156d5fba3e52
parent a82d61d27660f6ce7d25aa12484de1847379bec2
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/fattn-mma-f16.cuh
tests/test-backend-ops.cpp