git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	uvos <redacted>
	Wed, 1 Apr 2026 08:21:20 +0000 (10:21 +0200)
committer	GitHub <redacted>
	Wed, 1 Apr 2026 08:21:20 +0000 (10:21 +0200)
commit	88d5f8ffc398f9d6dbe529e3f0f0c739eabeafa7
tree	ecd3cb8ccec8cae9ed3e59b46c2b1809218005ee
parent	d43375ff7f73e5098837c20512aa58f4bc8edb02

CUDA/HIP: Fix kernel selection for mmvq mmid kernel to align host selection with device launch bounds (#21238)

The conditions cc == GGML_CUDA_CC_VOLTA || cc >= GGML_CUDA_CC_ADA_LOVELACE and cc >= GGML_CUDA_CC_TURING match all non-NVIDIA devices. As a result, on HIP devices the host attempts to launch the kernel with batch-size configurations larger than the kernel's launch bounds allow. This PR fixes the conditionals in get_mmvq_mmid_max_batch.

Fixes #21191
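
The failure mode described in the commit message can be illustrated with a small host-side sketch. It is not the code from mmvq.cu, and the actual fix is not shown on this page. It assumes that non-NVIDIA devices are encoded by adding a large offset to the compute capability value, so that any plain "cc >= X" comparison is true for them. The constants, the offset value, the example device value, and the helper names (is_nvidia, the *_unguarded and *_guarded predicates) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Illustrative values only. Assumption: NVIDIA compute capabilities are small
// integers (major*100 + minor*10), while non-NVIDIA devices are placed above
// all of them by adding a large offset.
constexpr int CC_VOLTA             = 700;
constexpr int CC_TURING            = 750;
constexpr int CC_ADA_LOVELACE      = 890;
constexpr int CC_OFFSET_NON_NVIDIA = 0x100000;                     // hypothetical offset
constexpr int CC_EXAMPLE_HIP       = CC_OFFSET_NON_NVIDIA + 0x908; // hypothetical HIP device

// Hypothetical vendor check: everything below the offset is treated as NVIDIA.
constexpr bool is_nvidia(int cc) { return cc < CC_OFFSET_NON_NVIDIA; }

// Unguarded predicates in the shape quoted in the commit message.
// Because of the offset, the ">=" comparisons also hold for non-NVIDIA devices.
constexpr bool volta_or_ada_unguarded(int cc) {
    return cc == CC_VOLTA || cc >= CC_ADA_LOVELACE;
}
constexpr bool turing_plus_unguarded(int cc) {
    return cc >= CC_TURING;
}

// One possible guarded form: apply the NVIDIA-specific comparison only when
// the device is actually an NVIDIA device.
constexpr bool volta_or_ada_guarded(int cc) {
    return is_nvidia(cc) && (cc == CC_VOLTA || cc >= CC_ADA_LOVELACE);
}
constexpr bool turing_plus_guarded(int cc) {
    return is_nvidia(cc) && cc >= CC_TURING;
}

// Compile-time checks of the behaviour described above.
static_assert(turing_plus_unguarded(CC_EXAMPLE_HIP),  "unguarded check matches the HIP device");
static_assert(!turing_plus_guarded(CC_EXAMPLE_HIP),   "guarded check rejects the HIP device");
static_assert(turing_plus_guarded(CC_TURING),         "guarded check still matches Turing");
```

With a guard of this kind, a host-side selector would pick the larger batch configuration only on the NVIDIA architectures it was tuned for, and non-NVIDIA devices would fall through to whatever limit matches the launch bounds their kernels were compiled with.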

ggml/src/ggml-cuda/mmvq.cu

Packaging of ggml-org/llama.cpp
