git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: faster large batch FA without tensor cores (#7314)
author Johannes Gäßler <redacted>
Fri, 17 May 2024 16:54:52 +0000 (18:54 +0200)
committer GitHub <redacted>
Fri, 17 May 2024 16:54:52 +0000 (18:54 +0200)
commit 0fc1e820a9900a3dd08ddd3c6abe6604c53b689b
tree 0a9a50311831d1343e301a55f49a7127a9d3df38
parent 82ca83db3c8d45df559c03a4225b6eb34808a2db
ggml-cuda/fattn-tile-f16.cu [new file with mode: 0644]
ggml-cuda/fattn-tile-f16.cuh [new file with mode: 0644]
ggml-cuda/fattn-tile-f32.cu [new file with mode: 0644]
ggml-cuda/fattn-tile-f32.cuh [new file with mode: 0644]
ggml-cuda/fattn-vec-f16.cu
ggml-cuda/fattn-vec-f32.cu
ggml-cuda/fattn.cu