git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit


author	Oleksandr Kuvshynov <redacted>
	Sun, 17 Aug 2025 22:28:58 +0000 (18:28 -0400)
committer	GitHub <redacted>
	Sun, 17 Aug 2025 22:28:58 +0000 (00:28 +0200)
commit	e5155e698645242d4f019267ecc40ea9bad81b09
tree	e483220d69f49c76ba7255b19e83f4d4019c138a
parent	21c17b5befc5f6be5992bc87fc1ba99d388561df

server : export max observed n_past value (#15361)

Track the high watermark of cache usage (the maximum observed n_past) and expose it through the /metrics endpoint.

Use case: measuring the largest cache usage actually needed under a realistic
workload, to better understand memory requirements and to adjust the cache
size and the model/cache quantization accordingly.
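
The idea can be sketched roughly as follows. This is a minimal illustration, not the code from tools/server/server.cpp: the struct `cache_metrics`, the function `render_metrics`, and the metric name `example_n_past_max` are all hypothetical, and the output assumes the Prometheus text exposition format.

```cpp
#include <algorithm>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical metrics holder; names are illustrative, not taken from server.cpp.
struct cache_metrics {
    int32_t n_past_max = 0; // largest n_past observed so far (high watermark)

    // Record the current n_past of a slot; the watermark never decreases.
    void observe(int32_t n_past) {
        n_past_max = std::max(n_past_max, n_past);
    }
};

// Render the watermark as Prometheus-style text.
// The metric name "example_n_past_max" is an assumption made for this sketch.
std::string render_metrics(const cache_metrics & m) {
    std::ostringstream out;
    out << "# HELP example_n_past_max Largest observed n_past.\n";
    out << "# TYPE example_n_past_max gauge\n";
    out << "example_n_past_max " << m.n_past_max << "\n";
    return out.str();
}
```

Calling `observe()` after each processed batch and scraping the rendered value over a representative workload would give an upper bound on the context length actually used, which is the number the use case above needs for sizing the cache.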

tools/server/server.cpp


Packaging of ggml-org/llama.cpp
