git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: support SET_ROWS (llama/14587)
author Jeff Bolz <redacted>
Sat, 12 Jul 2025 10:12:26 +0000 (05:12 -0500)
committer Georgi Gerganov <redacted>
Sat, 12 Jul 2025 13:05:00 +0000 (16:05 +0300)
commit 3b7e745ab8d79d4cdacb62e90f4d8b62ae2dd69e
tree 2b5f9c3f97b2028f478dd6a46046efe43242f3f1
parent 5eb294f4ff6e53272306d10b461487188bb6487e

* vulkan: support SET_ROWS

Add variants of the copy_to_quant shader that do the SET_ROWS operation.
Change these shaders to spread the work across the workgroup.
The memory access pattern is probably not great (one thread per quant block),
but should be fine for now.
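
For readers unfamiliar with the operation, here is a minimal CPU sketch of a row scatter with the "one thread per block" work split described above. It is an illustration under assumptions, not the shader code: `set_rows_ref`, its parameters, and the plain float copy (no quantization) are all hypothetical, and it assumes a 2D layout where `idx[r]` names the destination row for source row `r`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not part of ggml).
// src: idx.size() rows of n_cols floats, row-major.
// dst: destination rows of n_cols floats, row-major.
// idx: idx[r] is the destination row that receives source row r.
// Assumes n_cols is a multiple of block_size and every idx[r] is in range.
void set_rows_ref(std::vector<float> &dst, const std::vector<float> &src,
                  const std::vector<int64_t> &idx,
                  std::size_t n_cols, std::size_t block_size) {
    const std::size_t blocks_per_row = n_cols / block_size;
    const std::size_t n_work = idx.size() * blocks_per_row;

    // Each iteration stands in for one GPU thread handling one block.
    for (std::size_t i = 0; i < n_work; ++i) {
        const std::size_t src_row = i / blocks_per_row;
        const std::size_t blk     = i % blocks_per_row;
        const std::size_t dst_row = static_cast<std::size_t>(idx[src_row]);

        // A real quantizing shader would encode the block here; this sketch just copies.
        for (std::size_t k = 0; k < block_size; ++k) {
            const std::size_t col = blk * block_size + k;
            dst[dst_row * n_cols + col] = src[src_row * n_cols + col];
        }
    }
}
```

Neighbouring work items touch blocks that are `block_size` elements apart, which is one way to read the remark about the memory access pattern.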

* vulkan: optimize set_rows

Larger workgroups for non-quant types.
Set "norepeat" (there is manual repeat logic).
Use fastmod.
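
As a rough illustration of the last two points, the sketch below shows one common form of a fast modulo and how it could wrap a coordinate when a smaller tensor is repeated manually. This assumes "fastmod" refers to a power-of-two shortcut of this kind; `broadcast_coord` is a hypothetical helper name, and neither function is copied from the shader sources.

```cpp
#include <cstdint>

// Assumed form of a fast modulo: when b is a power of two,
// a % b equals a & (b - 1), which avoids an integer division.
// Precondition in this sketch: b > 0.
inline uint32_t fastmod(uint32_t a, uint32_t b) {
    if ((b & (b - 1)) == 0) {
        return a & (b - 1);
    }
    return a % b;
}

// Hypothetical helper: wrap a coordinate into a tensor dimension of size
// `extent`, so a smaller tensor is logically repeated along that dimension.
inline uint32_t broadcast_coord(uint32_t coord, uint32_t extent) {
    return fastmod(coord, extent);
}
```

Dimensions equal to 1 or to a power of two take the cheap branch, and everything else falls back to the ordinary `%`.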
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/copy_to_quant.comp
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp