git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: faster tile FA (Pascal/AMD), headsize 256 (#15769)
author Johannes Gäßler <redacted>
Sat, 6 Sep 2025 22:26:28 +0000 (00:26 +0200)
committer GitHub <redacted>
Sat, 6 Sep 2025 22:26:28 +0000 (00:26 +0200)
commit 79bc429262268ad2ac8a364cfe6c2d6b9c5f008a
tree 835dcbc3277e1cec25e038f5711bbbc4d3f8a6ed
parent c4df49a42d396bdf7344501813e7de53bc9e7bb3
ggml/src/ggml-cuda/fattn-tile-f16.cu [deleted file]
ggml/src/ggml-cuda/fattn-tile-f16.cuh [deleted file]
ggml/src/ggml-cuda/fattn-tile-f32.cu [deleted file]
ggml/src/ggml-cuda/fattn-tile-f32.cuh [deleted file]
ggml/src/ggml-cuda/fattn-tile.cu [new file with mode: 0644]
ggml/src/ggml-cuda/fattn-tile.cuh [new file with mode: 0644]
ggml/src/ggml-cuda/fattn.cu