git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: faster large batch FA without tensor cores (llama/7314)
author Johannes Gäßler <redacted>
Fri, 17 May 2024 16:54:52 +0000 (18:54 +0200)
committer Georgi Gerganov <redacted>
Tue, 28 May 2024 11:41:08 +0000 (14:41 +0300)
commit 6cf2e5d6fd399e925d961d6b42f85f6485ae7fb0
tree 817c22389841715af85dd58145e5bc290d336c1e
parent 930690634abf61f8187de0c9d2bcd9f6eeb3e1c8
src/ggml-cuda/fattn-tile-f16.cu [new file with mode: 0644]
src/ggml-cuda/fattn-tile-f16.cuh [new file with mode: 0644]
src/ggml-cuda/fattn-tile-f32.cu [new file with mode: 0644]
src/ggml-cuda/fattn-tile-f32.cuh [new file with mode: 0644]
src/ggml-cuda/fattn-vec-f16.cu
src/ggml-cuda/fattn-vec-f32.cu
src/ggml-cuda/fattn.cu