
author	Ruben Ortlam <redacted>
	Mon, 1 Sep 2025 14:19:07 +0000 (16:19 +0200)
committer	GitHub <redacted>
	Mon, 1 Sep 2025 14:19:07 +0000 (16:19 +0200)
commit	02c1813517412f3e00aa6ca7c0273fea64edb492
tree	99bd74128e8ba8249ebbf2eef44a2c9ac26e76b6
parent	77dee9de97be75b7143a213bc48893e0c0b29af7

Vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants (#14903)

* vulkan: Add Integer Dot Product mul_mat_vec shader for legacy quants

* vulkan: use subgroup operations for quantize_q8_1 shader

* vulkan: add q8_1_x4 type with 128-bit alignment, use in mul_mat_vecq shader

* vulkan: use q8_1_x4 blocks in mul_mmq shader

* vulkan: do 8 calculations per invocation instead of 32 in mul_mat_vecq, similar to mul_mat_vec

* vulkan: tune mul_mat_vecq performance for Intel

* vulkan: fix quantizing issue when tensor is not divisible by 128

* vulkan: adapt integer dot mmv to mmv small m optimization (#15355)

* vulkan: allow all subgroup modes for mmv and mmvq

* vulkan: use prealloc intermediate reuse for mmvq path

* vulkan: tune mmvq for Intel, AMD GCN and Nvidia RTX 3090

* vulkan: adapt mmv quantize_y path to conditional sync logic

* vulkan: disable q8_0 mmvq on Nvidia

* vulkan: enable q8_0 on Nvidia pre-turing

* fix prealloc sync condition

* fix llvmpipe subgroup 8 issue
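For readers unfamiliar with the approach, the following is a minimal CPU-side sketch of the idea behind an integer dot product mul_mat_vec path: the activation vector is quantized to q8_1-style blocks, and each weight block is then multiplied against it with integer arithmetic before the scales are applied. It is not the shader code from this commit. The struct and function names (`BlockQ8_1`, `BlockQ4_0`, `quantize_q8_1`, `dot_q4_0_q8_1`, `dot_reference`) are hypothetical, and the layouts are simplified assumptions: scales are stored as `float` rather than fp16, and the 4-bit weights are stored one per byte rather than packed.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int QK = 32;  // values per block, as in ggml's legacy quants

// Simplified q8_1-style block (assumption: float scales instead of fp16).
struct BlockQ8_1 {
    float  d;       // scale
    float  s;       // d * sum(qs), precomputed for offset correction
    int8_t qs[QK];  // quantized values
};

// Simplified q4_0-style block: value = (q - 8) * d.
// Assumption: one 4-bit value per byte for clarity (ggml packs two per byte).
struct BlockQ4_0 {
    float   d;
    uint8_t qs[QK];  // each in [0, 15]
};

// Quantize QK floats into one q8_1-style block.
BlockQ8_1 quantize_q8_1(const float* x) {
    float amax = 0.0f;
    for (int i = 0; i < QK; ++i) amax = std::fmax(amax, std::fabs(x[i]));

    BlockQ8_1 b{};
    b.d = amax / 127.0f;
    const float id = (b.d != 0.0f) ? 1.0f / b.d : 0.0f;

    int sum = 0;
    for (int i = 0; i < QK; ++i) {
        b.qs[i] = static_cast<int8_t>(std::lround(x[i] * id));
        sum += b.qs[i];
    }
    b.s = b.d * static_cast<float>(sum);
    return b;
}

// Integer path: accumulate q4 * q8 as integers, then apply scales once.
// sum_i (q4_i - 8) * d4 * q8_i * d8 = d4 * (d8 * isum - 8 * s)
float dot_q4_0_q8_1(const BlockQ4_0& w, const BlockQ8_1& y) {
    int32_t isum = 0;
    for (int i = 0; i < QK; ++i) {
        isum += static_cast<int32_t>(w.qs[i]) * static_cast<int32_t>(y.qs[i]);
    }
    return w.d * (y.d * static_cast<float>(isum) - 8.0f * y.s);
}

// Reference path: dequantize both sides and multiply in floating point.
float dot_reference(const BlockQ4_0& w, const BlockQ8_1& y) {
    float acc = 0.0f;
    for (int i = 0; i < QK; ++i) {
        const float wv = (static_cast<int>(w.qs[i]) - 8) * w.d;
        const float yv = static_cast<float>(y.qs[i]) * y.d;
        acc += wv * yv;
    }
    return acc;
}
```

The precomputed `s` field is what lets the inner loop stay purely integer: the constant offset of 8 in the weight encoding is folded into a single correction term per block instead of being subtracted per element. On hardware with integer dot product instructions, the inner loop maps to packed 4x8-bit dot products, which is presumably the motivation for the shader added here.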

ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vec_base.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mat_vecq.comp	[new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/mul_mmq.comp
ggml/src/ggml-vulkan/vulkan-shaders/mul_mmq_funcs.comp
ggml/src/ggml-vulkan/vulkan-shaders/quantize_q8_1.comp
ggml/src/ggml-vulkan/vulkan-shaders/types.comp
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp