git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
rpc : cache and reuse compute graphs (#15405)
author Radoslav Gerganov <redacted>
Fri, 28 Nov 2025 08:33:51 +0000 (10:33 +0200)
committer GitHub <redacted>
Fri, 28 Nov 2025 08:33:51 +0000 (08:33 +0000)
commit 15d2b46b4dec7747ca50c2f950f6f482ea0d6198
tree ea3ca9542ee8f06b105a2a855c4a3c8101a1d75c
parent 6bca76ff5ea5c7efe9c62e60852d88d350403d58

Store the last computed graph and reuse it when possible.
Also, do not return a response from GRAPH_COMPUTE; assume it always
completes successfully. If this is not the case, the server closes
the connection. This saves a network round trip to the server.
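
The idea above can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `graph_cache`, `fake_transport`, and `graph_compute` are hypothetical names, the graph is modeled as an opaque byte buffer, and a one-byte command tag is assumed. The sketch only shows the general pattern of remembering the last graph, sending a short "recompute" command when the next graph is identical, and not waiting for a reply.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical client-side cache that remembers the last serialized graph.
struct graph_cache {
    std::vector<uint8_t> last;      // bytes of the most recently sent graph
    bool has_graph = false;

    // Returns true if `data` is identical to the cached graph.
    // Otherwise stores `data` as the new cached graph and returns false.
    bool is_cached(const std::vector<uint8_t> & data) {
        if (has_graph && data == last) {
            return true;
        }
        last = data;
        has_graph = true;
        return false;
    }
};

// Hypothetical stand-in for the network connection; it only counts traffic.
struct fake_transport {
    size_t bytes_sent  = 0;
    size_t full_sends  = 0;
    size_t reuse_sends = 0;

    // Assumed wire format: 1-byte command tag followed by the graph bytes.
    void send_full(const std::vector<uint8_t> & graph) {
        bytes_sent += 1 + graph.size();
        full_sends++;
    }

    // Assumed wire format: 1-byte command tag, no payload.
    void send_recompute() {
        bytes_sent += 1;
        reuse_sends++;
    }
};

// Sends the graph, or a short recompute command if it matches the cache.
// No response is read: failure is assumed to surface as a closed connection.
void graph_compute(graph_cache & cache, fake_transport & conn,
                   const std::vector<uint8_t> & graph) {
    if (cache.is_cached(graph)) {
        conn.send_recompute();
    } else {
        conn.send_full(graph);
    }
}
```

In this sketch, repeated calls with an unchanged graph cost a single byte each, while any change to the graph falls back to a full send and replaces the cached copy.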
ggml/include/ggml-rpc.h
ggml/src/ggml-rpc/ggml-rpc.cpp