git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
server: improve slots scheduling for n_cmpl (#18789)
author Xuan-Son Nguyen <redacted>
Thu, 15 Jan 2026 16:10:28 +0000 (17:10 +0100)
committer GitHub <redacted>
Thu, 15 Jan 2026 16:10:28 +0000 (17:10 +0100)
commit a04c2b06a324cc9c7e09de4106597a86eb2421c5
tree f60c66eecf6abdfd940b8ea5431d4b390da94d0c
parent 39173bcacb67329850b9ff3108dd036eafb680f0

* server : make sure children tasks are scheduled to launch with parent

* fix

* add comment pointing to this PR

* fix

* clean up

* more debug messages

* add pop_deferred_task with specific ID version

* improve the logic

* simple approach

* no double move

* correct return type of launch_slots_with_parent_task
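
For readers without the diff at hand, the two ideas named in the list above (launching child tasks together with their parent, and a `pop_deferred_task` variant that targets a specific ID) might look roughly like the sketch below. This is an illustration only, not the code from this commit: `sketch_task`, `sketch_queue`, `defer` and `can_launch_with_children` are hypothetical names, the signature of `pop_deferred_task` is assumed, and the log does not say whether the real function matches on a task ID or a slot ID.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical, simplified task record (NOT the real server_task).
struct sketch_task {
    int id;        // unique task ID
    int id_parent; // ID of the parent task, or -1 if there is none
};

// Hypothetical queue: a main queue plus a list of tasks deferred
// because no slot was available when they were first picked up.
struct sketch_queue {
    std::deque<sketch_task> tasks;
    std::deque<sketch_task> deferred;

    // Park a task until a slot becomes available.
    void defer(sketch_task t) {
        assert(t.id >= 0);
        deferred.push_back(std::move(t));
    }

    // Assumed shape of an "ID-specific" pop: move the deferred task with
    // the given ID to the front of the main queue. Returns false if no
    // deferred task has that ID.
    bool pop_deferred_task(int id) {
        for (auto it = deferred.begin(); it != deferred.end(); ++it) {
            if (it->id == id) {
                tasks.push_front(std::move(*it));
                deferred.erase(it);
                return true;
            }
        }
        return false;
    }
};

// Assumed scheduling rule: a parent with n_children children is launched
// only when there are enough idle slots for the parent and every child.
bool can_launch_with_children(std::size_t n_idle_slots, std::size_t n_children) {
    return n_idle_slots >= n_children + 1;
}
```

Under these assumptions, a scheduler would keep a parent deferred while `can_launch_with_children` is false, then pop the parent and each child by ID once enough slots are idle, so that all of them start in the same pass.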

tools/server/server-context.cpp
tools/server/server-queue.cpp
tools/server/server-queue.h
tools/server/server-task.h
tools/server/tests/unit/test_chat_completion.py