git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: Implement set_tensor_async and the event interfaces (llama/18047)
author Jeff Bolz <redacted>
Sun, 21 Dec 2025 20:52:09 +0000 (14:52 -0600)
committer Georgi Gerganov <redacted>
Wed, 31 Dec 2025 10:39:43 +0000 (12:39 +0200)
commit 2ac2227ace9c0c388c76d15665b7a8e7a3800452
tree 74d4d8c9f4b78ed5d02ba79c5e6251306748dd5a
parent 68eab95c65563399ae90aafc04a33b78d6fa40a8

The goal is to enable the async loading code paths in
llama_model_loader::load_all_data, originally from #7896. This works, and the
loads themselves are faster. With host-visible vidmem, however, I think the
cost of allocating/mapping vidmem shifts elsewhere and grows, so I don't see a
net benefit by default. With GGML_VK_DISABLE_HOST_VISIBLE_VIDMEM=1, I do see a
significant improvement in model loading time.
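
To illustrate the kind of pipelined upload that an async set-tensor call plus events makes possible, here is a minimal, self-contained sketch. It is not the Vulkan backend code and not the loader's implementation: std::async stands in for the backend's asynchronous copy, and the names upload_event and upload_pipelined, as well as the choice of four staging buffers, are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical stand-in for a backend event: it becomes "signaled" once the
// asynchronous copy that recorded it has finished.
struct upload_event {
    std::future<void> done;

    // Block until the recorded copy has completed (no-op if nothing is pending).
    void synchronize() {
        if (done.valid()) {
            done.get();
        }
    }
};

// Copy `src` into `dst` in chunks through a small ring of staging buffers.
// Filling the next staging buffer overlaps with copies that are still in flight.
// Assumption: std::async plays the role of an async set-tensor call.
inline void upload_pipelined(const std::vector<uint8_t> & src,
                             std::vector<uint8_t> & dst,
                             size_t chunk_size) {
    assert(chunk_size > 0 && "chunk size must be positive");

    constexpr size_t n_buffers = 4; // assumed ring size, for illustration only
    std::array<std::vector<uint8_t>, n_buffers> staging;
    std::array<upload_event, n_buffers> events;
    for (auto & s : staging) {
        s.resize(chunk_size);
    }

    dst.resize(src.size());

    size_t slot = 0;
    for (size_t off = 0; off < src.size(); off += chunk_size) {
        const size_t n = std::min(chunk_size, src.size() - off);

        // Wait until the previous copy out of this staging buffer is done,
        // so the buffer can be safely overwritten.
        events[slot].synchronize();

        // Stand-in for reading the next chunk of the model file.
        std::memcpy(staging[slot].data(), src.data() + off, n);

        // Start the asynchronous copy and "record" its event.
        uint8_t * out = dst.data() + off;
        const uint8_t * in = staging[slot].data();
        events[slot].done = std::async(std::launch::async, [out, in, n] {
            std::memcpy(out, in, n);
        });

        slot = (slot + 1) % n_buffers;
    }

    // Drain all copies that are still in flight before returning.
    for (auto & e : events) {
        e.synchronize();
    }
}
```

Calling upload_pipelined(src, dst, 64) on a byte vector leaves dst equal to src. The point of the sketch is the ordering: an event is waited on only when its staging buffer is about to be reused, so reads and copies overlap.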
src/ggml-vulkan/ggml-vulkan.cpp