git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Update topk_moe fusion to handle gpt's late softmax (llama/16656)
author Jeff Bolz <redacted>
Wed, 29 Oct 2025 13:44:29 +0000 (08:44 -0500)
committer Georgi Gerganov <redacted>
Sat, 1 Nov 2025 07:41:35 +0000 (09:41 +0200)
commit 54ab43b107aff151df65f60525383f01632f6483
tree 3e2a46ef421ebe0302dc792252110d111e18c760
parent 558148d7422b9a6d867fd5dfae8e9974c212dc57

* vulkan: Update topk_moe fusion to handle gpt's late softmax

Based on #16649.

* Add ggml_check_edges

* Add sync logging to show fusion effects

* Handle clamp added in #16655

* Update ggml/src/ggml-impl.h

Co-authored-by: Diego Devesa <redacted>
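
For orientation, here is a minimal CPU sketch of the two routing orders the title appears to distinguish. It assumes "late softmax" means selecting the top-k experts on the raw router logits and only then applying softmax over the k selected values, as opposed to applying softmax over all experts first. The commit message does not spell this out, so treat it as an assumption. The function names (`route_early_softmax`, `route_late_softmax`) and the `Routing` alias are hypothetical and are not taken from ggml or from the `topk_moe.comp` shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper type: (expert index, routing weight), best expert first.
using Routing = std::vector<std::pair<int, float>>;

// Numerically stable softmax over a small vector.
static std::vector<float> softmax(const std::vector<float> & x) {
    assert(!x.empty());
    const float max_val = *std::max_element(x.begin(), x.end());
    std::vector<float> out(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = std::exp(x[i] - max_val);
        sum += out[i];
    }
    for (float & v : out) {
        v /= sum;
    }
    return out;
}

// Indices of the k largest entries, in descending order of value.
static std::vector<int> topk_indices(const std::vector<float> & x, std::size_t k) {
    assert(k <= x.size());
    std::vector<int> idx(x.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&](int a, int b) { return x[a] > x[b]; });
    idx.resize(k);
    return idx;
}

// Early softmax: softmax over all experts, then keep the top-k probabilities
// (left unnormalized here, so they sum to less than 1 when k < n).
Routing route_early_softmax(const std::vector<float> & logits, std::size_t k) {
    const std::vector<float> probs = softmax(logits);
    Routing r;
    for (int i : topk_indices(probs, k)) {
        r.emplace_back(i, probs[i]);
    }
    return r;
}

// Late softmax (assumed meaning): top-k on the raw logits, then softmax over
// only the k selected logits, so the weights always sum to 1.
Routing route_late_softmax(const std::vector<float> & logits, std::size_t k) {
    const std::vector<int> idx = topk_indices(logits, k);
    std::vector<float> selected;
    for (int i : idx) {
        selected.push_back(logits[i]);
    }
    const std::vector<float> weights = softmax(selected);
    Routing r;
    for (std::size_t j = 0; j < idx.size(); ++j) {
        r.emplace_back(idx[j], weights[j]);
    }
    return r;
}
```

Both orders pick the same experts, because softmax is monotonic; only the resulting weights differ. That is presumably why a fused kernel needs to recognize each pattern separately, although the actual graph-matching logic in `ggml-vulkan.cpp` is not reproduced here.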
src/ggml-impl.h
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/topk_moe.comp