git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: deduplicate FlashAttention code (llama/7352)
author Johannes Gäßler <redacted>
Sat, 18 May 2024 10:36:25 +0000 (12:36 +0200)
committer Georgi Gerganov <redacted>
Sun, 16 Jun 2024 15:19:48 +0000 (18:19 +0300)
commit 45b5b95e2942bdfc76956ffe67435dd38f1e6b4a
tree a5bbf2a7b9b615d0dbd73c8bb2189f49cb79ee2d
parent f2c47d1e6a023626db02c52907efd20d2b7960ef
ggml-cuda/common.cuh
ggml-cuda/fattn-common.cuh
ggml-cuda/fattn-tile-f16.cu
ggml-cuda/fattn-tile-f32.cu
ggml-cuda/fattn.cu
ggml-cuda/softmax.cu