git.djapps.eu Git - pkg/ggml/sources/ggml/commit
vulkan: sort graph to allow more parallel execution (llama/15850)
author Jeff Bolz <redacted>
Mon, 8 Sep 2025 18:10:07 +0000 (13:10 -0500)
committer Georgi Gerganov <redacted>
Sat, 20 Sep 2025 10:33:50 +0000 (13:33 +0300)
commit b14a96b635b1ef93c3456555161b8d12063400b7
tree 61839ba647efd75c423a801f877316470757c8f5
parent 6b874b33d17b9c0aa3559c7d8207ae093c65ff0d
* vulkan: sort graph to allow more parallel execution

Add a backend proc that lets the backend modify the graph. The
Vulkan implementation looks at which nodes depend on each other
and greedily reorders them so that nodes that don't depend on
each other are grouped together. It only reorders the nodes; it
doesn't change the contents of any of them.
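
The greedy grouping idea described above can be illustrated with a small standalone sketch. This is not the code from this commit: `Node`, `deps`, and `greedy_reorder` are hypothetical names, and the sketch assumes the input is already topologically sorted (every dependency index is smaller than the node's own index). It repeatedly collects all nodes whose dependencies have already been emitted, so nodes inside one group never depend on each other.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a graph node: only the indices of the
// earlier nodes it reads from are tracked.
struct Node {
    std::vector<std::size_t> deps;
};

// Returns a new execution order (indices into `nodes`) in which mutually
// independent nodes are placed next to each other. Node contents are untouched.
std::vector<std::size_t> greedy_reorder(const std::vector<Node> & nodes) {
    const std::size_t n = nodes.size();
    std::vector<bool> emitted(n, false);
    std::vector<std::size_t> order;
    order.reserve(n);

    while (order.size() < n) {
        // Collect every pending node whose dependencies were all emitted
        // in an earlier group.
        std::vector<std::size_t> group;
        for (std::size_t i = 0; i < n; ++i) {
            if (emitted[i]) {
                continue;
            }
            bool ready = true;
            for (std::size_t d : nodes[i].deps) {
                assert(d < i); // assumption: input is topologically sorted
                if (!emitted[d]) {
                    ready = false;
                    break;
                }
            }
            if (ready) {
                group.push_back(i);
            }
        }
        // Mark the group as emitted only after it is complete, so that no
        // node in the group can depend on another node in the same group.
        for (std::size_t i : group) {
            emitted[i] = true;
            order.push_back(i);
        }
    }
    return order;
}
```

For two independent chains 0 -> 1 and 2 -> 3 given in the order 0, 1, 2, 3, this sketch returns 0, 2, 1, 3: the two chain heads can run in parallel, then the two tails, with one dependency boundary in between instead of two.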

With #15489, this reduces the number of synchronizations needed.

* call optimize_graph per-split
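
As a rough illustration of an optional per-split hook, the sketch below shows one way such a backend proc could be wired up. All types here (`Graph`, `BackendInterface`, `Split`) and the function `optimize_splits` are hypothetical stand-ins, not the ggml definitions; only the name `optimize_graph` comes from the commit message. It assumes that backends without a reordering pass leave the hook null.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph: just an ordered list of node ids.
struct Graph {
    std::vector<int> nodes;
};

// Hypothetical backend interface with an optional hook. A backend that
// does not want to modify the graph leaves the pointer null (assumption).
struct BackendInterface {
    void (*optimize_graph)(Graph & graph);
};

// Hypothetical split: a sub-graph assigned to a single backend.
struct Split {
    const BackendInterface * backend;
    Graph                    graph;
};

// Calls the hook once per split, on that split's own sub-graph, and
// returns how many splits were actually handed to a backend.
std::size_t optimize_splits(std::vector<Split> & splits) {
    std::size_t called = 0;
    for (Split & s : splits) {
        assert(s.backend != nullptr); // every split is assumed to have a backend
        if (s.backend->optimize_graph != nullptr) {
            s.backend->optimize_graph(s.graph);
            ++called;
        }
    }
    return called;
}
```

Running the hook per split rather than on the whole graph means each backend only ever sees, and reorders, the nodes it will execute itself.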

13 files changed:
src/ggml-backend-impl.h
src/ggml-backend.cpp
src/ggml-blas/ggml-blas.cpp
src/ggml-cann/ggml-cann.cpp
src/ggml-cpu/ggml-cpu.cpp
src/ggml-cuda/ggml-cuda.cu
src/ggml-metal/ggml-metal.m
src/ggml-opencl/ggml-opencl.cpp
src/ggml-rpc/ggml-rpc.cpp
src/ggml-sycl/ggml-sycl.cpp
src/ggml-vulkan/ggml-vulkan.cpp
src/ggml-webgpu/ggml-webgpu.cpp
src/ggml-zdnn/ggml-zdnn.cpp