git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
rpc : reuse compute graph buffers (#21299)
author Radoslav Gerganov <redacted>
Fri, 3 Apr 2026 07:28:09 +0000 (10:28 +0300)
committer GitHub <redacted>
Fri, 3 Apr 2026 07:28:09 +0000 (10:28 +0300)
commit 0c58ba3365d2bc717b447b5d70e4d6be09ff3c40
tree eee0783d8460edc1e34736a6638e71c0102ea3dd
parent 57ace0d612a11133ac86edcc7af1b323bf05f12f
rpc : reuse compute graph buffers (#21299)

Reuse the buffer that backs the ggml context used to create the
compute graph on the server side. This partially addresses a memory
leak in the CUDA backend, which uses buffer addresses as cache keys.

ref: #21265
ref: #20315
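
A minimal sketch of the idea described above, not the code from this
commit: if a cache is keyed by buffer address, serving every request from
one reused buffer keeps the cache at a single entry. The names
`address_keyed_cache`, `graph_scratch` and `cache_entries_with_reuse` are
hypothetical and exist only for this illustration; the real change lives in
ggml/src/ggml-rpc/ggml-rpc.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a backend cache that uses buffer addresses as keys.
struct address_keyed_cache {
    std::unordered_set<const void *> keys;

    // Record that a buffer at this address was seen.
    void touch(const void * addr) { keys.insert(addr); }

    size_t size() const { return keys.size(); }
};

// Hypothetical scratch buffer that is kept alive and reused across requests.
struct graph_scratch {
    std::vector<uint8_t> storage;

    // Return a pointer to at least `size` bytes.
    // The storage is only reallocated when a larger size is requested.
    void * get(size_t size) {
        assert(size > 0);
        if (storage.size() < size) {
            storage.resize(size);
        }
        return storage.data();
    }
};

// Simulate `n_requests` graph computations of the same size that all share
// one scratch buffer, and return how many distinct cache keys were created.
size_t cache_entries_with_reuse(size_t n_requests, size_t size) {
    address_keyed_cache cache;
    graph_scratch scratch;
    for (size_t i = 0; i < n_requests; ++i) {
        cache.touch(scratch.get(size));
    }
    return cache.size();
}
```

With a fresh allocation per request, each request may land at a different
address and add another key, so the cache can keep growing. In this sketch
the address only changes when a larger buffer is requested.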
ggml/src/ggml-rpc/ggml-rpc.cpp