git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: larger SRAM reads for tile FA, AMD FP16 dot (llama/15927)
author Johannes Gäßler <redacted>
Thu, 11 Sep 2025 19:19:58 +0000 (21:19 +0200)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:33:50 +0000 (13:33 +0300)
commit b6b3db5189eda85ab052d22df4a9daafd846d2d1
tree dc3200ae9abd893eefb213e60168819e0efd5085
parent 9902604df05733b425a12d389acea0617b69f196

* CUDA: larger SRAM reads for tile FA, AMD FP16 dot

* fix logic for availability of v_dot2_f32_f16
src/ggml-cuda/common.cuh
src/ggml-cuda/fattn-tile.cu
src/ggml-cuda/vendors/hip.h
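
The second bullet of the commit message concerns when the AMD `v_dot2_f32_f16` instruction may be used. As a rough illustration only, the portable C++ sketch below shows the arithmetic a two-element dot product with FP32 accumulation performs, and how a fast path can be guarded by a compile-time availability check. It is not the code from this commit: `pair2`, `mad_pair`, and `SKETCH_HAS_DOT2` are hypothetical names, and plain `float` stands in for the FP16 inputs so that the sketch compiles without a GPU toolchain.

```cpp
// Stand-in for a packed pair of FP16 values (assumption: float is used
// instead of half so the sketch builds with any C++ compiler).
struct pair2 {
    float x;
    float y;
};

// Hypothetical availability switch. In real GPU code a macro like this
// would be derived from the target architecture; here it defaults to off.
#ifndef SKETCH_HAS_DOT2
#define SKETCH_HAS_DOT2 0
#endif

// Accumulate the two-element dot product of v and u into acc:
//     acc += v.x*u.x + v.y*u.y
static inline void mad_pair(float & acc, const pair2 v, const pair2 u) {
#if SKETCH_HAS_DOT2
    // Fast path: on hardware with a fused dot instruction, this single
    // expression is where that instruction would be emitted.
    acc = acc + (v.x*u.x + v.y*u.y);
#else
    // Fallback: two separate multiply-adds with the same intended result.
    acc += v.x*u.x;
    acc += v.y*u.y;
#endif
}
```

Both branches are meant to compute the same value, so callers do not need to know which path was compiled in. Getting the availability check wrong in either direction is costly: too permissive and the build fails or faults on unsupported hardware, too strict and the faster instruction goes unused.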