git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
mmq.cu: tune mmq/rocblas switching for RDNA (#18537)
author Beinsezii <redacted>
Tue, 6 Jan 2026 15:26:07 +0000 (07:26 -0800)
committer GitHub <redacted>
Tue, 6 Jan 2026 15:26:07 +0000 (16:26 +0100)
commit 968929528c6a05e10249366fbe5f0330ad9af678
tree afe4f2e4a04cd571dce20748d3f0fe8b6301b628
parent 3d26a09dc7b1a7c13da57fdd26d1cf22efa81229

* Patch perf regression for mmq kernels in ROCm

Recovers the performance regression reported in https://github.com/ggml-org/llama.cpp/issues/17917

* add n_experts branch like the cdna path

* mmq.cu: tune mmq/wmma switching for RDNA

* mmq.cu: move amd wmma mmq/wmma switching behind IS_RDNA3

* Update ggml/src/ggml-cuda/mmq.cu

Co-authored-by: Johannes Gäßler <redacted>
---------

Co-authored-by: Jiacheng (Jason) Chen <redacted>
Co-authored-by: jiachengjason <redacted>
Co-authored-by: Johannes Gäßler <redacted>
ggml/src/ggml-cuda/mmq.cu