server: support multiple generations from one prompt (OAI "n" option) (#17775)
author Xuan-Son Nguyen <redacted>
Sat, 6 Dec 2025 14:54:38 +0000 (15:54 +0100)
committer GitHub <redacted>
Sat, 6 Dec 2025 14:54:38 +0000 (15:54 +0100)
commit c42712b056fb2cf03902ef57fd314c531e356965
tree d17e51b77cef0768cefd17bbf7bdf8f956ec8e92
parent 09c7c50e64c98adac452d090406b6e5f6c320a41
server: support multiple generations from one prompt (OAI "n" option) (#17775)

* backend support

* server: support multiple generations from one prompt (OAI "n" option)

* fix invalid batch

* format oai

* clean up

* disable ctx shift

* add test

* update comments

* fix style

* add n_cmpl to docs [no ci]

* allow using both n_cmpl and n
tools/server/README.md
tools/server/server-common.cpp
tools/server/server-common.h
tools/server/server-context.cpp
tools/server/server-task.cpp
tools/server/server-task.h
tools/server/tests/unit/test_chat_completion.py
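
For context, the "n" option follows the OpenAI Chat Completions convention: a single prompt produces several independent generations, returned as separate entries in the "choices" array. Below is a minimal sketch of such a request against a locally running llama-server built from this commit; the host, port, model name, and sampling parameters are assumptions for illustration, not part of the commit itself.

```python
# Minimal sketch: request two generations from one prompt via the OAI "n" option.
# Assumes a llama-server instance is listening on http://localhost:8080 (assumed port).
import json
import urllib.request

payload = {
    "model": "default",  # llama-server typically does not require a specific model name
    "messages": [{"role": "user", "content": "Write a one-line haiku about autumn."}],
    "n": 2,              # ask for two independent generations of the same prompt
    "temperature": 0.8,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# Each generation arrives as a separate choice, indexed 0..n-1.
for choice in body["choices"]:
    print(choice["index"], choice["message"]["content"])
```

Per the commit message, the server's native parameter is named n_cmpl (documented in tools/server/README.md), and requests may use either n_cmpl or the OAI-style n.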