From: Oleksandr Kuvshynov
Date: Mon, 6 Oct 2025 07:53:31 +0000 (-0400)
Subject: server: update readme to mention n_past_max metric (#16436)
X-Git-Tag: upstream/0.0.6764~71
X-Git-Url: https://git.djapps.eu/?a=commitdiff_plain;h=c5fef0fcea3b978a2318b1af170209ecec7c37b4;p=pkg%2Fggml%2Fsources%2Fllama.cpp

server: update readme to mention n_past_max metric (#16436)

https://github.com/ggml-org/llama.cpp/pull/15361 added a new exported
metric, but I missed this doc.
---

diff --git a/tools/server/README.md b/tools/server/README.md
index 9f7ab229..6825c8bf 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -1045,6 +1045,7 @@ Available metrics:
 - `llamacpp:kv_cache_tokens`: KV-cache tokens.
 - `llamacpp:requests_processing`: Number of requests processing.
 - `llamacpp:requests_deferred`: Number of requests deferred.
+- `llamacpp:n_past_max`: High watermark of the context size observed.
 
 ### POST `/slots/{id_slot}?action=save`: Save the prompt cache of the specified slot to a file.
 
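
For readers who want to check the newly documented metric, the following is a minimal sketch of pulling `llamacpp:n_past_max` out of a scrape. It assumes the server's metrics output uses the Prometheus text exposition format. The helper name `read_metric` and the `SAMPLE` text are hypothetical and written by hand for illustration; they are not taken from the server or from this patch.

```python
from typing import Optional


def read_metric(exposition_text: str, name: str) -> Optional[float]:
    """Return the value of the first sample called `name`, or None if absent.

    Hypothetical helper: a simplified parser that assumes one sample per line
    and no whitespace inside label values.
    """
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # A sample line looks like "<name>[{labels}] <value> [timestamp]".
        sample_name = parts[0].split("{", 1)[0]
        if sample_name == name and len(parts) >= 2:
            return float(parts[1])
    return None


# Hand-written example scrape (assumption, not real server output).
SAMPLE = """\
# HELP llamacpp:requests_processing Number of requests processing.
llamacpp:requests_processing 1
# HELP llamacpp:n_past_max High watermark of the context size observed.
llamacpp:n_past_max 4096
"""

if __name__ == "__main__":
    print(read_metric(SAMPLE, "llamacpp:n_past_max"))
```

In practice a Prometheus server or an existing client library would do this parsing; the sketch only shows where the new metric name would appear in a scrape.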