git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit

author	matteo <redacted>
	Sun, 29 Jun 2025 18:02:53 +0000 (20:02 +0200)
committer	GitHub <redacted>
	Sun, 29 Jun 2025 18:02:53 +0000 (20:02 +0200)
commit	caf5681fcb47dfe9bafee94ef9aa8f669ac986c7
tree	803f1dac071f96e86a7625bfd615be009d5dd346	tree
parent	83790b0e7e09ab17238b16452a33053a71dbdfad	commit \| diff

server : support jinja extra template kwargs (Qwen3 enable_thinking feature), from command line and from client (#13196)

* initial commit for handling extra template kwargs

* enable_thinking and assistant prefill cannot be enabled at the same time

* can set chat_template_kwargs in command line

* added doc

* fixed formatting

* add support for extra context in generic template init

* coding standard: common/chat.cpp

Co-authored-by: Georgi Gerganov <redacted>
* coding standard: common/chat.cpp

Co-authored-by: Georgi Gerganov <redacted>
* Apply suggestions from code review

coding standard: cosmetic changes

Co-authored-by: Georgi Gerganov <redacted>
* fix merge conflict

* chat.cpp: simplify calls to apply to ensure systematic propagation of extra_context (+ the odd existing additional_context)

* normalize environment variable name

* simplify code

* prefill cannot be used with thinking models

* compatibility with the new reasoning-budget parameter

* fix prefill for non thinking models

---------

Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Olivier Chafik <redacted>

common/arg.cpp		diff \| blob \| history
common/chat.cpp		diff \| blob \| history
common/chat.h		diff \| blob \| history
common/common.h		diff \| blob \| history
tools/server/README.md		diff \| blob \| history
tools/server/server.cpp		diff \| blob \| history
tools/server/utils.hpp		diff \| blob \| history