git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
CUDA: faster tile FA, add oob checks, more HSs (#16492)
author Johannes Gäßler <redacted>
Sat, 11 Oct 2025 18:54:32 +0000 (20:54 +0200)
committer GitHub <redacted>
Sat, 11 Oct 2025 18:54:32 +0000 (20:54 +0200)
commit 11f0af5504252e453d57406a935480c909e3ff37
tree 5a525cccfad79b85e51cb88b518349750aabf1ca
parent a3cb04744fb5c591985f53b749fef5407d07a145
18 files changed:
ggml/src/ggml-cuda/CMakeLists.txt
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-tile.cu
ggml/src/ggml-cuda/fattn-tile.cuh
ggml/src/ggml-cuda/fattn-wmma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq112-dv112.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq128-dv128.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq256-dv256.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq40-dv40.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq576-dv512.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq64-dv64.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq80-dv80.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq96-dv96.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-hip/CMakeLists.txt
ggml/src/ggml-musa/CMakeLists.txt