git.djapps.eu Git - pkg/ggml/sources/whisper.cpp/commit
SYCL: Avoid using with SYCL-Graph for unsupported nodes (llama/13587)
author Ewan Crawford <redacted>
Thu, 22 May 2025 08:24:09 +0000 (09:24 +0100)
committer Georgi Gerganov <redacted>
Tue, 27 May 2025 15:03:00 +0000 (18:03 +0300)
commit 730a00be8a067ad65b73fa978314049e2a29165f
tree 5cf12aebe127921a27ee262c68f7c5a51a740667
parent 316600e8eed30c2da6d124f18311039562b56f22

Currently, when running
`GGML_SYCL_DISABLE_GRAPH=0 ./bin/test-backend-ops -b SYCL0` on the CUDA
backend of SYCL, two operations throw an exception because they perform
blocking waits while the queue is being recorded.

* `-o CONCAT` : Use of blocking waits on a queue that's being recorded https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/concat.cpp#L185-L187
* `-o MUL_MAT_ID`: Blocking wait on a recording queue for a copy to host memory https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/ggml-sycl.cpp#L3072-L3074

We've noticed that `ggml-cuda.cu` has the
[check_node_graph_compatibility_and_refresh_copy_ops](https://github.com/ggml-org/llama.cpp/blob/39e73ae0d69f882d7e29cecc6dd8f5052fca6731/ggml/src/ggml-cuda/ggml-cuda.cu#L2458-L2458)
method for checking whether a graph can be used, even when graphs are enabled.
I've taken a similar approach in this PR by adding a method to `ggml-sycl.cpp`
that checks whether a graph can be used for the requested operations, even if
the user has asked for graphs to be enabled.
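
As a rough illustration of the approach described above (not the code added by this PR), the check can be sketched as a walk over the graph's nodes that rejects any op known to need a blocking wait. The `Op` enum, `Node` struct, and both function names below are hypothetical stand-ins, not ggml or SYCL types; only the two rejected ops, CONCAT and MUL_MAT_ID, come from the description above.

```cpp
#include <vector>

// Hypothetical stand-ins for the ggml op enum and graph node (assumptions,
// not the real ggml types).
enum class Op { Add, MulMat, Concat, MulMatId };

struct Node {
    Op op;
};

// Returns true only if no node in the graph is one of the ops that perform
// a blocking wait and therefore cannot be recorded into a SYCL-Graph.
bool graph_is_recordable(const std::vector<Node>& nodes) {
    for (const Node& node : nodes) {
        switch (node.op) {
            case Op::Concat:    // blocking waits on the recording queue
            case Op::MulMatId:  // blocking wait for a copy to host memory
                return false;
            default:
                break;
        }
    }
    return true;
}

// A graph is used only when the user has not disabled graphs AND every
// node passes the compatibility check.
bool should_use_graph(bool graphs_disabled_by_user,
                      const std::vector<Node>& nodes) {
    return !graphs_disabled_by_user && graph_is_recordable(nodes);
}
```

With a check of this shape, a run with `GGML_SYCL_DISABLE_GRAPH=0` would fall back to normal eager submission for a graph containing CONCAT or MUL_MAT_ID instead of throwing during recording.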
ggml/src/ggml-sycl/ggml-sycl.cpp