git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: fix Pascal FA, deq. KV to FP16 for batch > 8 (llama/7681)
author Johannes Gäßler <redacted>
Sat, 1 Jun 2024 13:47:04 +0000 (15:47 +0200)
committer Georgi Gerganov <redacted>
Sat, 15 Jun 2024 19:05:47 +0000 (22:05 +0300)
commit 655a5b4dddacb33a04adc978163f1799ba2b6e9c
tree 47b6e0bc0c5ca872b481a88b243e83f741caff31
parent ebe78772cd321ed1c969e58da951b0ff72b13110
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/fattn-tile-f16.cu
src/ggml-cuda/fattn-tile-f32.cu
src/ggml-cuda/fattn-vec-f16.cuh
src/ggml-cuda/fattn-vec-f32.cuh
src/ggml-cuda/fattn-wmma-f16.cuh
src/ggml-cuda/fattn.cu