git.djapps.eu Git - pkg/ggml/sources/ggml/commit
CUDA: MoE helper in device code, better tile sizes (llama/15525)
author Johannes Gäßler <redacted>
Mon, 25 Aug 2025 15:23:40 +0000 (17:23 +0200)
committer Georgi Gerganov <redacted>
Fri, 5 Sep 2025 09:54:03 +0000 (12:54 +0300)
commit 06034b3c24212e9eb85e90d2079dafb444a148d2
tree 47eaa01c4c4c1823fa42108a4225bedd840b4486
parent 1eb4bc716c1b89b1a211d0cf00c96c920aff5168
CUDA: MoE helper in device code, better tile sizes (llama/15525)

* CUDA: MoE helper in device code, better tile sizes

* reduce superfluous CUDA blocks

src/ggml-cuda/common.cuh
src/ggml-cuda/mmq.cu
src/ggml-cuda/mmq.cuh
src/ggml-cuda/vendors/hip.h