git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
vulkan: improve partial offloading performance on AMD (llama/19976)
author Ruben Ortlam <redacted>
Sun, 1 Mar 2026 16:32:14 +0000 (17:32 +0100)
committer Georgi Gerganov <redacted>
Mon, 16 Mar 2026 11:10:15 +0000 (13:10 +0200)
commit 2a9649c4205215cfc8ca6e296f7299fd25153bf2
tree 076da55a8518c961fef73d694037475df0f1634f
parent ca3f6bbd3cbf796e1966b0893ea248d456a83592
vulkan: improve partial offloading performance on AMD (llama/19976)

* vulkan: fix and enable cpy_tensor_async function

* use transfer_queue for async transfers on AMD, synchronize with timeline semaphore

* update offload_op logic

* fix missing transfer submission

* disable async transfer queue on AMD GCN

* revert op batch size change

* fix cpy_tensor_async checks
ggml/src/ggml-vulkan/ggml-vulkan.cpp