git.djapps.eu Git - pkg/ggml/sources/ggml/commit
mmq.cu: tune mmq/rocblas switching for RDNA (llama/18537)
author Beinsezii <redacted>
Tue, 6 Jan 2026 15:26:07 +0000 (07:26 -0800)
committer Georgi Gerganov <redacted>
Sun, 11 Jan 2026 09:02:08 +0000 (11:02 +0200)
commit 803885a3d5444518b2919f669bbb311dbde92cb9
tree f151a3d4572efaead4192ffac558a930d58a8283
parent f8256829e07c7b3035dd43d892175551ca746482

* Patch perf regression for mmq kernels in ROCm

Recover from the performance regression tracked in https://github.com/ggml-org/llama.cpp/issues/17917

* add n_experts branch like the cdna path

* mmq.cu: tune mmq/wmma switching for RDNA

* mmq.cu: move amd wmma mmq/wmma switching behind IS_RDNA3

* Update ggml/src/ggml-cuda/mmq.cu

Co-authored-by: Johannes Gäßler <redacted>
---------

Co-authored-by: Jiacheng (Jason) Chen <redacted>
Co-authored-by: jiachengjason <redacted>
Co-authored-by: Johannes Gäßler <redacted>
src/ggml-cuda/mmq.cu