git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit

author	Jeff Bolz <redacted>
	Fri, 2 Jan 2026 21:32:30 +0000 (15:32 -0600)
committer	Georgi Gerganov <redacted>
	Wed, 14 Jan 2026 07:11:59 +0000 (09:11 +0200)
commit	9d83865607bc4112bb41f6d7c544c3874651859d
tree	2e283d4995a3603ad16a59f5340fcb6fb6435fd4
parent	b7ff521e71bba72c0d0f2108433fdbb3b14c196e

vulkan: Optimize GGML_OP_CUMSUM (llama/18417)

* vulkan: Optimize GGML_OP_CUMSUM

There are two paths: the preexisting one, which processes a whole row per
workgroup in a single shader, and a new one that splits each row into multiple
blocks and runs two passes. The first pass computes partial sums within each
block; the second adds the partials of the preceding blocks to produce the
final result. The multipass shaders are used when there is a small number of
large rows.

In the whole-row shader, handle multiple elements per invocation.
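
As a rough illustration of handling multiple elements per invocation, the
sketch below simulates one workgroup walking a row in tiles. It assumes each
invocation owns `elem_per_thread` contiguous elements, which may not match the
actual shader's memory layout; the names `cumsum_row_tiled`, `num_threads` and
`elem_per_thread` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU simulation of a whole-row scan where every invocation handles several
// elements. The contiguous per-invocation layout is an assumption.
std::vector<float> cumsum_row_tiled(const std::vector<float>& row,
                                    std::size_t num_threads,
                                    std::size_t elem_per_thread) {
    assert(num_threads > 0 && elem_per_thread > 0);
    const std::size_t n = row.size();
    const std::size_t tile = num_threads * elem_per_thread;
    std::vector<float> out(n);
    std::vector<float> thread_offset(num_threads);
    float carry = 0.0f;  // total of all previous tiles in this row

    for (std::size_t base = 0; base < n; base += tile) {
        // Step 1: each invocation scans its own elements serially.
        for (std::size_t t = 0; t < num_threads; ++t) {
            float acc = 0.0f;
            for (std::size_t e = 0; e < elem_per_thread; ++e) {
                const std::size_t idx = base + t * elem_per_thread + e;
                if (idx < n) {
                    acc += row[idx];
                    out[idx] = acc;
                }
            }
            thread_offset[t] = acc;
        }

        // Step 2: exclusive scan over the per-invocation totals, seeded with
        // the carry (on a GPU this would be a cooperative workgroup scan).
        float running = carry;
        for (std::size_t t = 0; t < num_threads; ++t) {
            const float total = thread_offset[t];
            thread_offset[t] = running;
            running += total;
        }

        // Step 3: each invocation adds its offset to its own elements.
        for (std::size_t t = 0; t < num_threads; ++t) {
            for (std::size_t e = 0; e < elem_per_thread; ++e) {
                const std::size_t idx = base + t * elem_per_thread + e;
                if (idx < n) {
                    out[idx] += thread_offset[t];
                }
            }
        }
        carry = running;
    }
    return out;
}
```

With more elements per invocation, the cooperative scan in step 2 runs once
per larger tile, so fewer such scans are needed for a row of a given length.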

* use 2 ELEM_PER_THREAD for AMD/Intel

* address feedback

ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/cumsum.comp
ggml/src/ggml-vulkan/vulkan-shaders/cumsum_multipass1.comp	[new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/cumsum_multipass2.comp	[new file with mode: 0644]
ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp