git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (#7681)
author Johannes Gäßler <redacted>
Sat, 1 Jun 2024 13:47:04 +0000 (15:47 +0200)
committer GitHub <redacted>
Sat, 1 Jun 2024 13:47:04 +0000 (15:47 +0200)
commit 750f60c03e4d3f53fa51910551ce87a3d508d2d7
tree 6fdbcc6e7824b1ebd7586de57fff3c76636a86da
parent 9b596417af11c9ac44fcae0fcfbc6f3665089083
ggml-cuda/fattn-common.cuh
ggml-cuda/fattn-tile-f16.cu
ggml-cuda/fattn-tile-f32.cu
ggml-cuda/fattn-vec-f16.cuh
ggml-cuda/fattn-vec-f32.cuh
ggml-cuda/fattn-wmma-f16.cuh
ggml-cuda/fattn.cu