git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: enable FA for FP32 KV cache (llama/16546)
author Johannes Gäßler <redacted>
Tue, 14 Oct 2025 12:22:47 +0000 (14:22 +0200)
committer Georgi Gerganov <redacted>
Tue, 14 Oct 2025 19:07:44 +0000 (22:07 +0300)
commit a499133a5ebdae605902e587263dc5cb93f5b753
tree 4718282f3bd6457befc91c2e81a5a4a37d29b34a
parent b8774a9fc2926adf15b37b37b05cb146f2acbf6c
CUDA: enable FA for FP32 KV cache (llama/16546)
src/ggml-cuda/fattn-vec.cuh
src/ggml-cuda/fattn.cu