git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
server: add auto-sleep after N seconds of idle (#18228)
author Xuan-Son Nguyen <redacted>
Sun, 21 Dec 2025 01:24:42 +0000 (02:24 +0100)
committer GitHub <redacted>
Sun, 21 Dec 2025 01:24:42 +0000 (02:24 +0100)
commit ddcb75dd8ac42dc23eb84f13bb17670fe9f2d49b
tree 5dc68358bb9ccba1bd4f16e89892d177e5561beb
parent 52ab19df633f3de5d4db171a16f2d9edd2342fec

* implement sleeping at queue level

* implement server-context suspend

* add test

* add docs

* optimization: add fast path

* make sure to free llama_init

* nits

* fix use-after-free

* allow /models to be accessed during sleeping, fix use-after-free

* don't allow accessing /models during sleep, it is not thread-safe

* fix data race on accessing props and model_meta

* small clean up

* trailing whitespace

* rm outdated comments
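
The "sleeping at queue level" idea in the list above can be illustrated with a rough sketch. This is not the code from the commit (which lives in tools/server/server-queue.cpp and is not shown on this page); it is a minimal Python illustration, and every name in it (`IdleSleepQueue`, `post`, `run_once`, `on_sleep`, `on_wake`) is hypothetical. It assumes the general pattern of waiting on the task queue with a timeout, releasing resources when the timeout expires, and restoring them when the next task arrives.

```python
import queue
from typing import Callable


class IdleSleepQueue:
    """Hypothetical sketch: a task queue that 'sleeps' after N idle seconds.

    None of these names come from llama.cpp; they only illustrate the pattern.
    """

    def __init__(self, idle_seconds: float,
                 on_sleep: Callable[[], None],
                 on_wake: Callable[[], None]) -> None:
        self._tasks: "queue.Queue[Callable[[], None]]" = queue.Queue()
        self._idle_seconds = idle_seconds
        self._on_sleep = on_sleep   # assumed hook, e.g. release the model
        self._on_wake = on_wake     # assumed hook, e.g. reload the model
        self.sleeping = False

    def post(self, task: Callable[[], None]) -> None:
        """Enqueue a task; it is executed by a later run_once() call."""
        self._tasks.put(task)

    def run_once(self) -> bool:
        """Wait for one task. Returns False if the idle timeout fired instead."""
        # While asleep there is nothing left to release, so wait indefinitely.
        timeout = None if self.sleeping else self._idle_seconds
        try:
            task = self._tasks.get(timeout=timeout)
        except queue.Empty:
            # No task arrived within the idle window: go to sleep.
            self.sleeping = True
            self._on_sleep()
            return False
        if self.sleeping:
            # First task after sleeping: wake up before doing any work.
            self._on_wake()
            self.sleeping = False
        task()
        return True
```

In this sketch a worker thread would call `run_once()` in a loop. Keeping the sleep decision inside the queue means request handlers only ever call `post()` and never need to know whether the worker is currently asleep.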
12 files changed:
common/arg.cpp
common/common.h
tools/cli/cli.cpp
tools/server/README-dev.md
tools/server/README.md
tools/server/server-context.cpp
tools/server/server-context.h
tools/server/server-queue.cpp
tools/server/server-queue.h
tools/server/server.cpp
tools/server/tests/unit/test_sleep.py [new file with mode: 0644]
tools/server/tests/utils.py