git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: fix padding of GQA to power of 2 in FA (#19115)
author Johannes Gäßler <redacted>
Mon, 26 Jan 2026 22:24:58 +0000 (23:24 +0100)
committer GitHub <redacted>
Mon, 26 Jan 2026 22:24:58 +0000 (23:24 +0100)
commit b0311c16d2f650a8bd5af652549075b458bd713a
tree d7d385f05760a6baac6c8f419e372c9f636972e5
parent 8f80d1b254aef70a0959e314be368d05debe7294
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh
tests/test-backend-ops.cpp