server: fix n_cmpl not skipping processing prompt (#18663)
author      Xuan-Son Nguyen <redacted>
Fri, 9 Jan 2026 23:00:41 +0000 (00:00 +0100)
committer   GitHub <redacted>
Fri, 9 Jan 2026 23:00:41 +0000 (00:00 +0100)
commit      9ac2693a302adab4b5b44fb1fef52c9e6d14a770
tree        4500fa9dfc3d86f96359f3331150ee84b4223180
parent      a61c8bc3bfae4f86b8205535bcea73f476b28c2c
server: fix n_cmpl not skipping processing prompt (#18663)

* server: fix n_cmpl not skipping processing

* fix infinite loop on empty batch

* cont : init child samplers + modify child logic

* cont : cleanup

* cont : improve n_cmpl logic (see the sketch after this list)

- launch the parent task first so it finds the slot with the best cache
- the parent task waits for the child tasks to be launched
- when a child task finishes, remove its cache

* cont : remove redundant function

* cont : reduce parent checks

* fix : nullptr task dereference
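
The bullets above describe a scheduling order rather than concrete code. Below is a minimal sketch of that idea only; it is not the actual server-context.cpp implementation, and all names (slot, completion_task, run) are invented for illustration. The parent task processes and caches the prompt first, and the child tasks created for the remaining completions reuse that cache instead of re-processing the prompt:

    #include <cstdio>
    #include <string>

    struct slot {
        std::string cached_prompt; // tokens already in the KV cache (simplified to a string)
    };

    struct completion_task {
        int  id;
        bool is_child; // children reuse the prompt processed by the parent
    };

    // only the parent processes the prompt; a child finds it already cached and
    // skips straight to generation
    static void run(slot & s, const completion_task & t, const std::string & prompt) {
        if (!t.is_child) {
            s.cached_prompt = prompt; // parent: process and cache the prompt
            std::printf("task %d (parent): processed prompt\n", t.id);
        } else if (s.cached_prompt == prompt) {
            std::printf("task %d (child): prompt already cached, skip processing\n", t.id);
        }
        std::printf("task %d: generate completion\n", t.id);
    }

    int main() {
        const int n_cmpl = 3; // request 3 completions of the same prompt
        const std::string prompt = "Once upon a time";

        slot s;

        // launch the parent first so it claims the slot and fills the cache ...
        run(s, {0, false}, prompt);

        // ... then launch the children; each reuses the cached prompt, and once a
        // child finishes its per-task cache is removed again (as listed above)
        for (int i = 1; i < n_cmpl; ++i) {
            completion_task child{i, true};
            run(s, child, prompt);
            std::printf("task %d: finished, remove its cache\n", child.id);
        }

        return 0;
    }

The actual change also makes the parent wait until every child task has been launched, which this single-threaded sketch does not attempt to show.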

---------

Co-authored-by: Georgi Gerganov <redacted>
tools/server/server-context.cpp
tools/server/server-task.h