git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: fix padding of GQA to power of 2 in FA (llama/19115)
author Johannes Gäßler <redacted>
Mon, 26 Jan 2026 22:24:58 +0000 (23:24 +0100)
committer Georgi Gerganov <redacted>
Fri, 30 Jan 2026 13:56:40 +0000 (15:56 +0200)
commit 41d5d7bb0efda27c65ad302c71a0f185cb93fbf0
tree 033bccdf5a5434a695e07687489802a0b70f14a7
parent f63848eada9a8a1c1a0ab52c389a15e189e33c58
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-mma-f16.cuh