git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: faster tile FA (Pascal/AMD), headsize 256 (llama/15769)
author Johannes Gäßler <redacted>
Sat, 6 Sep 2025 22:26:28 +0000 (00:26 +0200)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:33:50 +0000 (13:33 +0300)
commit ec7bd5c950b34e47df2b52d901fc283d81b95679
tree 2631a29aebcd0dfd728a767ec0692eea83b70a33
parent fcecf7b0b137536a403d3d601cfd298be4520822
src/ggml-cuda/fattn-tile-f16.cu [deleted file]
src/ggml-cuda/fattn-tile-f16.cuh [deleted file]
src/ggml-cuda/fattn-tile-f32.cu [deleted file]
src/ggml-cuda/fattn-tile-f32.cuh [deleted file]
src/ggml-cuda/fattn-tile.cu [new file with mode: 0644]
src/ggml-cuda/fattn-tile.cuh [new file with mode: 0644]
src/ggml-cuda/fattn.cu