git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
CUDA: larger SRAM reads for tile FA, AMD FP16 dot (llama/15927)
author Johannes Gäßler <redacted>
Thu, 11 Sep 2025 19:19:58 +0000 (21:19 +0200)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:45:28 +0000 (13:45 +0300)
commit f0768eb575f16d7eaa66c1fc507241f40d5081d2
tree 3645668a95a9b4f157875f1725efeaf73c830a4c
parent 020eb19eb3a2e1994fddde4b07de7d907c7d0286

* CUDA: larger SRAM reads for tile FA, AMD FP16 dot

* fix logic for availability of v_dot2_f32_f16
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-tile.cu
ggml/src/ggml-cuda/vendors/hip.h