git.djapps.eu Git - pkg/ggml/sources/ggml/commit

author	uvos <redacted>
	Wed, 1 Apr 2026 08:21:20 +0000 (10:21 +0200)
committer	Georgi Gerganov <redacted>
	Wed, 1 Apr 2026 13:00:26 +0000 (16:00 +0300)
commit	e9dfca719d58f18d096fe22c241071917c2ee729
tree	d165ee8a6bef02756952e5cb08255345c828f4e7
parent	9695adb06458e4feb283821818a287df35b38f31

CUDA/HIP: Fix kernel selection for mmvq mmid kernel to align host selection with device launch bounds (llama/21238)

The conditions cc == GGML_CUDA_CC_VOLTA || cc >= GGML_CUDA_CC_ADA_LOVELACE and cc >= GGML_CUDA_CC_TURING match all non-NVIDIA devices. As a result, on HIP devices we attempt to launch the kernel with batch-size configurations larger than its launch bounds allow. This PR fixes the conditionals in get_mmvq_mmid_max_batch.

Fixes #21191
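
The failure mode described above can be sketched in plain C++. This is an illustration only, not the code from mmvq.cu: the constant values, the batch limits, and the helper names (is_nvidia, max_batch_unguarded, max_batch_guarded) are all assumptions made for the example. It assumes non-NVIDIA devices are encoded by adding a large vendor offset to the compute capability, so that any bare ">=" comparison against an NVIDIA architecture is true for them.

```cpp
// Illustrative sketch only: every constant and function below is a
// hypothetical stand-in, not copied from ggml's mmvq.cu.
constexpr int CC_VOLTA             = 700;
constexpr int CC_TURING            = 750;
constexpr int CC_ADA_LOVELACE      = 890;
constexpr int CC_OFFSET_NON_NVIDIA = 0x1000000; // assumed vendor offset

// Hypothetical batch limits; BATCH_SAFE stands for the value that the
// device-side launch bounds were compiled for on other vendors.
constexpr int BATCH_LARGE  = 8;
constexpr int BATCH_MEDIUM = 6;
constexpr int BATCH_SAFE   = 4;

constexpr bool is_nvidia(int cc) { return cc < CC_OFFSET_NON_NVIDIA; }

// Unguarded selection: the >= comparisons are also true for any cc that
// carries the vendor offset, so non-NVIDIA devices get BATCH_LARGE.
constexpr int max_batch_unguarded(int cc) {
    return (cc == CC_VOLTA || cc >= CC_ADA_LOVELACE) ? BATCH_LARGE
         : (cc >= CC_TURING)                         ? BATCH_MEDIUM
                                                     : BATCH_SAFE;
}

// Guarded selection: the architecture comparisons are only evaluated
// for NVIDIA devices; everything else falls back to BATCH_SAFE.
constexpr int max_batch_guarded(int cc) {
    return !is_nvidia(cc) ? BATCH_SAFE : max_batch_unguarded(cc);
}
```

With these assumed values, a device id such as CC_OFFSET_NON_NVIDIA + 0x908 receives the large limit from the unguarded function and the safe limit from the guarded one, while NVIDIA ids are unaffected. Whether the actual fix uses a vendor check of this shape is not stated in the commit message.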

src/ggml-cuda/mmvq.cu

Packaging of ggml-org/ggml