CUDA: fix FlashAttention on Turing (llama/13415)
author      Johannes Gäßler <redacted>
            Sat, 10 May 2025 07:16:52 +0000 (09:16 +0200)
committer   Georgi Gerganov <redacted>
            Tue, 13 May 2025 10:02:19 +0000 (13:02 +0300)
commit      637981b2afd5c0d23eddc14799b314692c839453
tree        bc08197446e344e8dedffefd1eb350c284f00133
parent      449571c3ce3f8c0fc5f7ee061e1db37ff9dce480
CUDA: fix FlashAttention on Turing (llama/13415)
src/ggml-cuda/fattn-mma-f16.cuh
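
The touched file, src/ggml-cuda/fattn-mma-f16.cuh, holds the MMA-based FP16 FlashAttention kernel. The commit page does not include the diff, so the following is only a minimal, hypothetical sketch of how CUDA code commonly gates an architecture-specific path so that Turing (compute capability 7.5) takes a code path its tensor cores support while Ampere and newer take another; the kernel name and values below are illustrative, not taken from this commit.

// Illustrative only: not the actual patch from this commit.
// Demonstrates the usual compile-time compute-capability guard
// (__CUDA_ARCH__) used to separate a Turing (cc 7.5) path from an
// Ampere-or-newer (cc >= 8.0) path inside a kernel.
#include <cstdio>

__global__ void flash_attn_kernel_stub(float * dst) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 800
    // Ampere and newer: features such as cp.async are available.
    dst[threadIdx.x] = 2.0f;
#else
    // Turing and older: fall back to the more restricted path.
    dst[threadIdx.x] = 1.0f;
#endif
}

int main() {
    float * d_dst = nullptr;
    cudaMalloc(&d_dst, 32*sizeof(float));
    flash_attn_kernel_stub<<<1, 32>>>(d_dst);
    cudaDeviceSynchronize();

    float h_dst[32];
    cudaMemcpy(h_dst, d_dst, sizeof(h_dst), cudaMemcpyDeviceToHost);
    printf("path marker: %.0f\n", h_dst[0]); // 1 = Turing path, 2 = Ampere+ path
    cudaFree(d_dst);
    return 0;
}

Compiled with nvcc, the marker printed depends on the architecture the kernel was built for (e.g. -arch=sm_75 selects the Turing branch).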