git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: attention sinks for mma FlashAttention (#15157)
author    Johannes Gäßler <redacted>
          Fri, 8 Aug 2025 06:19:58 +0000 (08:19 +0200)
committer GitHub <redacted>
          Fri, 8 Aug 2025 06:19:58 +0000 (08:19 +0200)
commit 1425f587a82bc303469b5c32759a2746ba4e1e20
tree   e9dc1fc1b17f1748fac23ed90108dcfff22bbb23
parent aaa3d07ae749b781d6135eaff23c7fa8a4ab404a
Files changed:
ggml/src/ggml-cuda/fattn-mma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/ggml-cuda.cu
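
For context, an attention sink is an extra per-head logit that participates in the softmax denominator but contributes no output weight, so it absorbs part of the probability mass; this commit wires that into the mma-based FlashAttention CUDA kernel. Below is a minimal Python sketch of just the numerics (not the CUDA kernel); the function name `attn_weights_with_sink` is hypothetical, and the running-max subtraction mirrors the numerically stable softmax used in FlashAttention-style kernels:

```python
import math

def attn_weights_with_sink(scores, sink):
    """Softmax over KV scores with an extra per-head 'sink' logit.

    The sink enters the (numerically stable) denominator but yields
    no output weight, so the returned weights sum to less than 1 and
    the sink absorbs the remaining probability mass.
    """
    m = max(sink, max(scores))            # stable max, sink included
    exps = [math.exp(s - m) for s in scores]
    denom = math.exp(sink - m) + sum(exps)
    return [e / denom for e in exps]
```

With `sink = -inf` the extra denominator term vanishes and this reduces to a plain softmax, which is one easy sanity check for a kernel implementing the same math.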