From: Oleksandr Kuvshynov <redacted>
Date: Mon, 6 Oct 2025 07:53:31 +0000 (-0400)
Subject: server: update readme to mention n_past_max metric (#16436)
X-Git-Tag: upstream/0.0.6764~71
X-Git-Url: https://git.djapps.eu/?a=commitdiff_plain;h=c5fef0fcea3b978a2318b1af170209ecec7c37b4;p=pkg%2Fggml%2Fsources%2Fllama.cpp

server: update readme to mention n_past_max metric (#16436)

https://github.com/ggml-org/llama.cpp/pull/15361 added a new exported
metric, but I missed updating this doc.
---
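A minimal sketch of how a consumer might read the newly documented
`llamacpp:n_past_max` value from Prometheus-style exposition text. The
helper name `read_metric` and the `SAMPLE` payload (including its help
text and numbers) are hypothetical and only illustrate the idea; real
server output may differ.

```python
from typing import Optional


def read_metric(exposition_text: str, name: str) -> Optional[float]:
    """Return the value of the first unlabeled sample called `name`, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # An unlabeled sample line has the form "<name> <value> [timestamp]".
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


# Hypothetical payload, made up for illustration; not captured from a server.
SAMPLE = """\
# HELP llamacpp:n_past_max Illustrative help text.
llamacpp:n_past_max 4096
llamacpp:requests_processing 1
"""

n_past_max = read_metric(SAMPLE, "llamacpp:n_past_max")
print(n_past_max)
```

The parser deliberately ignores labeled samples (`name{label="x"} value`); a full Prometheus client library would be the safer choice for anything beyond a quick check.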

diff --git a/tools/server/README.md b/tools/server/README.md
index 9f7ab229f..6825c8bf3 100644
--- a/tools/server/README.md
+++ b/tools/server/README.md
@@ -1045,6 +1045,7 @@ Available metrics:
 - `llamacpp:kv_cache_tokens`: KV-cache tokens.
 - `llamacpp:requests_processing`: Number of requests processing.
 - `llamacpp:requests_deferred`: Number of requests deferred.
+- `llamacpp:n_past_max`: High watermark of the context size observed.
 
 ### POST `/slots/{id_slot}?action=save`: Save the prompt cache of the specified slot to a file.