CUDA: larger SRAM reads for tile FA, AMD FP16 dot (#15927)
author    Johannes Gäßler <redacted>
    Thu, 11 Sep 2025 19:19:58 +0000 (21:19 +0200)
committer GitHub <redacted>
    Thu, 11 Sep 2025 19:19:58 +0000 (21:19 +0200)
commit    0e6ff0046f4a2983b2c77950aa75960fe4b4f0e2
tree      7893487e7bbcaaf490b5f54629f4868951a367f0
parent    df082f56309073ecf885eceaa21b86e8a487e61b
CUDA: larger SRAM reads for tile FA, AMD FP16 dot (#15927)

* CUDA: larger SRAM reads for tile FA, AMD FP16 dot

* fix logic for availability of v_dot2_f32_f16
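The two ideas named in the commit title can be sketched as follows. This is an illustrative sketch only, not the actual `fattn-tile.cu` change: it shows (1) reading attention tiles out of shared memory ("SRAM") in 128-bit chunks rather than one FP16 value at a time, and (2) using a packed FP16 dot product (the `v_dot2_f32_f16` instruction on AMD, exposed via the clang builtin `__builtin_amdgcn_fdot2`) that accumulates in FP32. The function name `tile_row_dot` and the exact architecture guard are assumptions; the second commit ("fix logic for availability of v_dot2_f32_f16") adjusted precisely that kind of per-architecture availability check.

```cuda
#include <cuda_fp16.h>

#if defined(GGML_USE_HIP)
// HIP/clang vector type matching the operand type of __builtin_amdgcn_fdot2.
typedef _Float16 v2f16 __attribute__((ext_vector_type(2)));
#endif

// Dot product of two rows of half2 values held in shared memory.
__device__ float tile_row_dot(const half2 * __restrict__ K,
                              const half2 * __restrict__ Q, const int n_half2) {
    float sum = 0.0f;
    for (int i0 = 0; i0 < n_half2; i0 += 4) {
        // One 128-bit shared-memory read: 4x half2 = 8 FP16 values per load,
        // instead of 8 separate 16-bit loads.
        const int4 k4 = *(const int4 *) (K + i0);
        const int4 q4 = *(const int4 *) (Q + i0);
#pragma unroll
        for (int j = 0; j < 4; ++j) {
            const half2 k = ((const half2 *) &k4)[j];
            const half2 q = ((const half2 *) &q4)[j];
#if defined(GGML_USE_HIP) && defined(__gfx90a__) // guard is an assumption
            // v_dot2_f32_f16: packed FP16 dot product with FP32 accumulator.
            sum = __builtin_amdgcn_fdot2(*(const v2f16 *) &k,
                                         *(const v2f16 *) &q, sum, false);
#else
            // Portable fallback: widen to FP32 and accumulate.
            const float2 kf = __half22float2(k);
            const float2 qf = __half22float2(q);
            sum += kf.x*qf.x + kf.y*qf.y;
#endif
        }
    }
    return sum;
}
```

Vectorizing the shared-memory reads reduces the number of load instructions per tile, and the packed dot keeps the FP16 multiplies in a single instruction on hardware that supports it while accumulating at full FP32 precision.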
ggml/src/ggml-cuda/common.cuh
ggml/src/ggml-cuda/fattn-tile.cu
ggml/src/ggml-cuda/vendors/hip.h