vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron (llama/18295)
author Jeff Bolz <redacted>
Thu, 1 Jan 2026 07:58:27 +0000 (01:58 -0600)
committer Georgi Gerganov <redacted>
Sun, 11 Jan 2026 09:02:08 +0000 (11:02 +0200)
commit 1fae2224fbc4f784487c3ecf1af4a91ef1157264
tree dc27f5a3cdd6c678f5ab08e395ef578d6395430b
parent ebc3a0f4a56be1c9424a89fbec09962ac34fde85

* vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron

Also handle GGML_OP_SCALE at the end (nemotron, deepseek2).

Reduce the number of pipeline variants and spec constants by using push constants instead.

In test_topk_moe, change exp_probs_b to be 1D, matching real networks.

Update test-backend-ops and ggml-backend to allow verifying multiple outputs
in a fusion test (topk_moe has two outputs). Previously only the final node
was verified.

* change test_topk_moe to allow results in arbitrary order

* disable sigmoid fusion for MoltenVK
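
For orientation, the routing pattern named in the commit message can be sketched on the CPU as below. This is a minimal illustration under stated assumptions, not the shader's code: the function name `topk_moe_sigmoid_ref` and its parameters are hypothetical, and it assumes the bias `exp_probs_b` only influences which experts are selected, while the returned weights come from the unbiased sigmoid probabilities, are optionally normalized, and are then multiplied by a final scale (the trailing GGML_OP_SCALE).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical CPU reference, not part of ggml.
// Returns k (expert index, weight) pairs for a single token.
std::vector<std::pair<int, float>> topk_moe_sigmoid_ref(
        const std::vector<float> & logits,       // one router logit per expert
        const std::vector<float> & exp_probs_b,  // 1D per-expert selection bias
        int k, bool normalize, float scale) {
    const size_t n = logits.size();
    assert(exp_probs_b.size() == n && k >= 0 && (size_t) k <= n);

    // Sigmoid gating; the bias is added only to the values used for selection
    // (an assumption of this sketch).
    std::vector<float> probs(n), biased(n);
    for (size_t i = 0; i < n; ++i) {
        probs[i]  = 1.0f / (1.0f + std::exp(-logits[i]));
        biased[i] = probs[i] + exp_probs_b[i];
    }

    // Pick the k experts with the largest biased probability.
    std::vector<int> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                      [&](int a, int b) { return biased[a] > biased[b]; });

    // Weights come from the unbiased probabilities of the selected experts.
    float sum = 0.0f;
    for (int j = 0; j < k; ++j) {
        sum += probs[idx[j]];
    }

    std::vector<std::pair<int, float>> out;
    for (int j = 0; j < k; ++j) {
        float w = probs[idx[j]];
        if (normalize) {
            w /= sum;  // sum > 0 because every sigmoid output is positive
        }
        out.emplace_back(idx[j], w * scale);  // trailing scale step
    }
    return out;
}
```

A fused kernel produces both outputs (the selected expert ids and their weights) in one pass, which is why a fusion test needs to verify more than the final node. The order of the k results in this sketch follows the sort, whereas a test that accepts arbitrary order would compare the results as a set.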
include/ggml-backend.h
src/ggml-backend.cpp
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/topk_moe.comp
tests/test-backend-ops.cpp