git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: optimize and reenable split_k (llama/10637)
author Jeff Bolz <redacted>
Tue, 3 Dec 2024 19:29:54 +0000 (13:29 -0600)
committer Georgi Gerganov <redacted>
Thu, 5 Dec 2024 12:27:39 +0000 (14:27 +0200)
commit eae88f1e3e026a2f810b82425b06f957a3caaaa7
tree 1e175e42fcf92d1a77b92ef01a4ee0b1099f5988
parent 74d66b63eaf207a24f3e93bb922aba131cbf2906

Use vector loads when possible in mul_mat_split_k_reduce. Use split_k
when there aren't enough workgroups to keep the GPU's shader cores busy.
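
The two ideas in this message can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the code from this commit: the function names `guess_split_k` and `split_k_reduce`, the `k >= 2048` threshold, the "fewer than half the cores" test, and the cap of 4 are all hypothetical choices made for the example. The reduce step handles four contiguous elements at a time to mirror a vec4 load, and falls back to scalar accesses for the tail.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical heuristic: split the K dimension only when the output tiles
// alone would leave most shader cores idle. All thresholds are assumptions.
inline uint32_t guess_split_k(uint32_t m, uint32_t n, uint32_t k,
                              uint32_t tile_m, uint32_t tile_n,
                              uint32_t shader_core_count) {
    if (shader_core_count == 0 || tile_m == 0 || tile_n == 0) return 1;
    const uint32_t m_tiles = (m + tile_m - 1) / tile_m;  // ceil(m / tile_m)
    const uint32_t n_tiles = (n + tile_n - 1) / tile_n;  // ceil(n / tile_n)
    const uint32_t workgroups = m_tiles * n_tiles;
    if (workgroups == 0) return 1;
    // Split only for a long K and when fewer than half the cores would be busy.
    if (k < 2048 || workgroups >= shader_core_count / 2) return 1;
    return std::min(shader_core_count / workgroups, 4u);
}

// Sum split_k partial results into dst. `partials` holds split_k slices of
// `ne` elements each, stored back to back.
inline void split_k_reduce(const std::vector<float>& partials,
                           std::vector<float>& dst,
                           std::size_t ne, uint32_t split_k) {
    dst.assign(ne, 0.0f);
    std::size_t i = 0;
    // "Vector" path: four contiguous elements per step, like a vec4 load.
    for (; i + 4 <= ne; i += 4) {
        float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
        for (uint32_t s = 0; s < split_k; ++s) {
            const float* p = partials.data() + s * ne + i;
            for (int j = 0; j < 4; ++j) acc[j] += p[j];
        }
        std::copy(acc, acc + 4, dst.data() + i);
    }
    // Scalar tail for element counts that are not a multiple of four.
    for (; i < ne; ++i) {
        float acc = 0.0f;
        for (uint32_t s = 0; s < split_k; ++s) acc += partials[s * ne + i];
        dst[i] = acc;
    }
}
```

With these assumed thresholds, a small output (few tiles) with a long K gets split so that more workgroups are launched, while a large output that already produces plenty of workgroups is left unsplit.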
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-vulkan/vulkan-shaders/mul_mat_split_k_reduce.comp