git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Implement top-k (llama/17418)
author Jeff Bolz <redacted>
Wed, 26 Nov 2025 15:45:43 +0000 (09:45 -0600)
committer Georgi Gerganov <redacted>
Thu, 11 Dec 2025 13:32:46 +0000 (15:32 +0200)
commit f6de24a9020f351865f6ecb2203ee9b0ca30fdce
tree f3ad0ee758a4ef360aaac061dee3f22029171c39
parent 27a858f51e4e014de786ac71ff5b5b843eb82396

* vulkan: Implement top-k

Each pass launches workgroups that each sort 2^N elements (where N is usually 7-10)
and discard all but the top K. Passes repeat until only K elements are left. There is
also a fast path for K==1 that just finds the max value rather than sorting.

* fix pipeline selection

* vulkan: Add N-ary search algorithm for topk

* microoptimizations
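
The multi-pass scheme described under the first bullet can be sketched on the CPU
as follows. This is a hypothetical reference model, not the shader code from this
commit: the name topk_multipass, the log2_chunk parameter (standing in for N) and
the tie-breaking rule are assumptions made for illustration, and each loop
iteration over a chunk stands in for one workgroup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical CPU model of the multi-pass top-k (not the Vulkan shader itself).
// Returns the indices of the k largest values in `data`, largest first.
// `log2_chunk` plays the role of N: each simulated workgroup sorts 2^N elements.
std::vector<uint32_t> topk_multipass(const std::vector<float>& data, size_t k,
                                     unsigned log2_chunk = 7) {
    assert(log2_chunk < 32);
    const size_t n = data.size();
    k = std::min(k, n);
    if (k == 0) return {};

    // Fast path for K == 1: a max reduction instead of a sort.
    if (k == 1) {
        auto it = std::max_element(data.begin(), data.end());
        return { static_cast<uint32_t>(it - data.begin()) };
    }

    // A pass only shrinks the input if a chunk holds more than k elements.
    size_t chunk = size_t(1) << log2_chunk;
    while (chunk <= k) chunk <<= 1;

    std::vector<uint32_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0u);

    // Descending by value; ties broken by lower index (an assumption).
    auto by_value_desc = [&](uint32_t a, uint32_t b) {
        return data[a] > data[b] || (data[a] == data[b] && a < b);
    };

    while (true) {
        std::vector<uint32_t> next;
        // One loop iteration stands in for one workgroup.
        for (size_t start = 0; start < idx.size(); start += chunk) {
            const size_t end = std::min(start + chunk, idx.size());
            std::sort(idx.begin() + start, idx.begin() + end, by_value_desc);
            const size_t keep = std::min(k, end - start);  // discard all but top k
            next.insert(next.end(), idx.begin() + start, idx.begin() + start + keep);
        }
        idx = std::move(next);
        if (idx.size() <= k) break;  // only reached after a single-chunk pass
    }
    return idx;
}
```

Each full chunk contributes k of its 2^N elements to the next pass, so the
candidate count shrinks by roughly a factor of 2^N / k per pass, and the final
single-chunk pass leaves the survivors sorted.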
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/topk_argsort.comp [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/topk_nary_search.comp [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
tests/test-backend-ops.cpp