git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: use mma FA kernel for gqa > 4 on RTX 4000 (llama/15035)
author Johannes Gäßler <redacted>
Sat, 2 Aug 2025 14:37:08 +0000 (16:37 +0200)
committer Georgi Gerganov <redacted>
Thu, 14 Aug 2025 11:17:28 +0000 (14:17 +0300)
commit be4f47db5c3bd12e6da18f0a3cc8d69441fb2f40
tree 2924674bf4bd42619207bb8ab6d8cc6024d45ba1
parent 7dee1d6a1e7611f238d09be96738388da97c88ed
src/ggml-cuda/fattn.cu