server : implement prompt processing progress report in stream mode (#15827)
author    Xuan-Son Nguyen <redacted>
          Sat, 6 Sep 2025 11:35:04 +0000 (18:35 +0700)
committer GitHub <redacted>
          Sat, 6 Sep 2025 11:35:04 +0000 (13:35 +0200)
commit    61bdfd5298a78593be649a1035ee2a120b13c4f0
tree      33c3c6f82f32ead09a89bd0d2a17d0570224fdc3
parent    01806e77714ae8a78130d432945b959a0956c56f
server : implement prompt processing progress report in stream mode (#15827)

* server : implement `return_progress`

* add timings.cache_n

* add progress.time_ms

* add test

* fix test for chat/completions

* readme: add docs on timings

* use ggml_time_us
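
The `return_progress` option added by this commit could be consumed from the server's streaming output roughly as sketched below. This is a minimal illustration, not the commit's own client code: the `prompt_progress` object and its `total`/`cache`/`processed`/`time_ms` fields are assumptions inferred from the bullet points above, and the SSE `data:` framing is the server's usual streaming format.

```python
import json

def parse_progress_event(sse_line: str):
    """Parse one SSE 'data: {...}' line from the server's streamed response
    and return the prompt-processing progress object, if present.

    NOTE: the 'prompt_progress' key and its fields (total, cache,
    processed, time_ms) are assumptions based on this commit's
    description, not a verified API contract.
    """
    if not sse_line.startswith("data: "):
        return None
    payload = sse_line[len("data: "):].strip()
    if payload == "[DONE]":  # end-of-stream marker used by OpenAI-style streams
        return None
    obj = json.loads(payload)
    return obj.get("prompt_progress")

# Hypothetical chunk as the server might emit it when return_progress=true:
chunk = ('data: {"prompt_progress": '
         '{"total": 1024, "cache": 256, "processed": 512, "time_ms": 83}}')
progress = parse_progress_event(chunk)
if progress:
    # Tokens already resolved = freshly processed + reused from KV cache
    done = progress["processed"] + progress["cache"]
    print(f"prompt: {done}/{progress['total']} tokens "
          f"({progress['time_ms']} ms)")
```

A client would set `"stream": true` and `"return_progress": true` in the request body, then feed each received line through a parser like this to drive a progress bar while the prompt is being processed.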

Co-authored-by: Georgi Gerganov <redacted>
---------

Co-authored-by: Georgi Gerganov <redacted>
tools/server/README.md
tools/server/server.cpp
tools/server/tests/unit/test_chat_completion.py