git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: faster tile FA, add oob checks, more HSs (llama/16492)
author Johannes Gäßler <redacted>
Sat, 11 Oct 2025 18:54:32 +0000 (20:54 +0200)
committer Georgi Gerganov <redacted>
Tue, 14 Oct 2025 19:07:44 +0000 (22:07 +0300)
commit 0b56d427d54efe1797180de4f6db4b2d62e09ae7
tree 3a6d8b49bfbb286744d7da81fe8d02aa0f327122
parent fcc2a5c0cfd81ee0517ee42f1acdc371ec92d598
18 files changed:
src/ggml-cuda/CMakeLists.txt
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-common.cuh
src/ggml-cuda/fattn-tile.cu
src/ggml-cuda/fattn-tile.cuh
src/ggml-cuda/fattn-wmma-f16.cuh
src/ggml-cuda/fattn.cu
src/ggml-cuda/template-instances/fattn-tile-instance-dkq112-dv112.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq128-dv128.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq256-dv256.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq40-dv40.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq576-dv512.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq64-dv64.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq80-dv80.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/fattn-tile-instance-dkq96-dv96.cu [new file with mode: 0644]
src/ggml-cuda/template-instances/generate_cu_files.py
src/ggml-hip/CMakeLists.txt
src/ggml-musa/CMakeLists.txt