git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: faster tile FA, add oob checks, more HSs (llama/16492)
author Johannes Gäßler <redacted>
Sat, 11 Oct 2025 18:54:32 +0000 (20:54 +0200)
committer Georgi Gerganov <redacted>
Wed, 15 Oct 2025 06:29:17 +0000 (09:29 +0300)
commit b5fb9b9f58ea65d0e367d1183dd328283aecee66
tree 6e049232b90906e635176f38052b58dc8afdb157
parent a91dd3be72f70dd1b3cb6e252f35fa17b93f596c
18 files changed:
ggml/src/ggml-cuda/CMakeLists.txt
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-common.cuh
ggml/src/ggml-cuda/fattn-tile.cu
ggml/src/ggml-cuda/fattn-tile.cuh
ggml/src/ggml-cuda/fattn-wmma-f16.cuh
ggml/src/ggml-cuda/fattn.cu
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq112-dv112.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq128-dv128.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq256-dv256.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq40-dv40.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq576-dv512.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq64-dv64.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq80-dv80.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/fattn-tile-instance-dkq96-dv96.cu [new file with mode: 0644]
ggml/src/ggml-cuda/template-instances/generate_cu_files.py
ggml/src/ggml-hip/CMakeLists.txt
ggml/src/ggml-musa/CMakeLists.txt