git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (#12943)
author Radoslav Gerganov <redacted>
Fri, 25 Apr 2025 07:08:08 +0000 (10:08 +0300)
committer GitHub <redacted>
Fri, 25 Apr 2025 07:08:08 +0000 (10:08 +0300)
commit 553a5c3a9fdf771be2101bc3529937963f817457
tree 658c55b2798e65ee845ee97925666cbf5b4ea918
parent 13be08daf992c89d5169518229b3740041c0f419

RPC_CMD_SET_TENSOR always returns an empty response, and we send this
command 4 times per token. We can improve token generation (TG) speed if
we don't wait for this empty response.

The performance impact of this change depends on the network latency.
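
The idea can be illustrated with a minimal sketch. This is not the code from ggml-rpc.cpp: `FakeTransport`, `Cmd`, `send_cmd`, and `blocked_ms_per_token` are hypothetical names invented for this example, and the transport is an in-memory stand-in that only counts blocking waits. It assumes an ordered, reliable stream (such as TCP), so a later command still reaches the server after the tensor data it depends on.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical command identifiers; not the actual ggml-rpc enum.
enum class Cmd : uint8_t { SetTensor, GetTensor };

// Toy in-memory transport that counts how often the client blocks
// waiting for a reply. A real transport would be a socket.
struct FakeTransport {
    int sent  = 0;  // number of commands written
    int waits = 0;  // number of blocking reads for a reply

    void send(Cmd, const std::vector<uint8_t> &) { ++sent; }
    std::vector<uint8_t> recv() { ++waits; return {}; }
};

// Send a command and block for the reply only when the caller needs one.
// Commands whose reply is always empty can pass expect_response = false.
inline std::vector<uint8_t> send_cmd(FakeTransport & t, Cmd cmd,
                                     const std::vector<uint8_t> & payload,
                                     bool expect_response) {
    t.send(cmd, payload);
    if (!expect_response) {
        return {};  // fire-and-forget: no round trip is paid here
    }
    return t.recv();
}

// Rough model (an assumption): each blocking wait costs about one
// network round trip, so the time spent blocked scales with latency.
inline double blocked_ms_per_token(int waits_per_token, double rtt_ms) {
    return waits_per_token * rtt_ms;
}
```

Under this rough model, skipping 4 waits per token saves about 4 round trips per token, which is why the gain grows with network latency and is small on a low-latency link.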
ggml/include/ggml-rpc.h
ggml/src/ggml-rpc/ggml-rpc.cpp