git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Implement topk_moe fused shader, ported from CUDA (llama/16641)
author Jeff Bolz <redacted>
Sat, 18 Oct 2025 10:22:57 +0000 (05:22 -0500)
committer Georgi Gerganov <redacted>
Tue, 21 Oct 2025 15:14:33 +0000 (18:14 +0300)
commit c48d0af1d915f980f03aef2fbb8e8ee569b50756
tree 5e3bfb5f4091f69b0ff7e5052a7bedbaeae1e993
parent 1fe7675fa66ca8c6930f93b54c3b40c8d964276b

This is similar to the CUDA shader from #16130, but doesn't use shared memory
and handles different subgroup sizes.
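
The message does not spell out what the fused operation computes. As a rough illustration only, and assuming topk_moe means "softmax over one token's expert logits, then keep the k most probable experts", a scalar CPU reference could look like the sketch below. The function name topk_moe_reference and its signature are hypothetical and do not appear in this commit. The sketch does not model subgroups, optional weight normalization, or anything else specific to topk_moe.comp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CPU reference, NOT the shader added by this commit.
// Assumed semantics: softmax over the expert logits of a single token,
// followed by selection of the k experts with the largest probability.
// Returns (expert id, softmax weight) pairs in descending weight order.
std::vector<std::pair<int, float>> topk_moe_reference(const std::vector<float>& logits,
                                                      std::size_t k) {
    assert(!logits.empty() && k <= logits.size());

    // Numerically stable softmax: subtract the maximum before exponentiating.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) {
        p /= sum;
    }

    // k selection passes; each pass picks the largest probability not yet taken.
    // Ties resolve to the lowest expert index because the comparison is strict.
    std::vector<bool> taken(probs.size(), false);
    std::vector<std::pair<int, float>> selected;
    selected.reserve(k);
    for (std::size_t pass = 0; pass < k; ++pass) {
        std::size_t best = probs.size();  // sentinel: nothing chosen yet
        for (std::size_t i = 0; i < probs.size(); ++i) {
            if (!taken[i] && (best == probs.size() || probs[i] > probs[best])) {
                best = i;
            }
        }
        taken[best] = true;
        selected.emplace_back(static_cast<int>(best), probs[best]);
    }
    return selected;
}
```

For example, topk_moe_reference({1.0f, 3.0f, 2.0f, 0.0f}, 2) selects experts 1 and 2, in that order. A reference of this kind is mainly useful for checking a GPU implementation's output on small inputs.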
src/ggml-impl.h
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/topk_moe.comp [new file with mode: 0644]
src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp