git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: attention sinks for mma FlashAttention (llama/15157)
author Johannes Gäßler <redacted>
Fri, 8 Aug 2025 06:19:58 +0000 (08:19 +0200)
committer Georgi Gerganov <redacted>
Mon, 18 Aug 2025 17:30:45 +0000 (20:30 +0300)
commit 2baea5e4b328f74de657a53da2f8e7ddd39e7c37
tree 8df16915e80764d1cc273cd1185486b547703d59
parent 8a36cd924a9a5af7bd168d538f306fefd1b77f6c
ggml/src/ggml-cuda/fattn-mma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/ggml-cuda.cu