author hksdpc255 <redacted>
Tue, 18 Nov 2025 17:54:15 +0000 (04:54 +1100)
committer GitHub <redacted>
Tue, 18 Nov 2025 17:54:15 +0000 (18:54 +0100)
commit 1920345c3bcec451421bb6abc4981678cc721154
tree ef3f9e0ab4242bc9ca0ef3097dd6f61408649bf7
parent 561a3e2788b6310f5ac6691d1b2a91546887191e
common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)

* Add files via upload

* fix unit test

* fix crashes for --reasoning-format=none

* Patch buggy official MiniMax-M2 chat template

* add upstream minja fix: https://github.com/ochafik/minja/pull/7

* Fix <think> token not generated

* add test copied from https://github.com/ggml-org/llama.cpp/pull/16946

* cleanup

* Attempt to fix the compilation error on CI

* Delete chat template patching since it’s fixed by upstream Minja

* Remove unneeded MiniMax-M2 template patch

https://github.com/ochafik/minja/pull/7#issuecomment-3480356100

* Add proper handling of optional parameters with test
merged tests from: https://github.com/ggml-org/llama.cpp/pull/16946/commits/23d4bb75c485c12ac89f81c424dc03c87a640e8c

* Fix making all tool parameters optional

* Move xml tool parser to separate file

* cleanup & add tests for GLM4.5

* add streaming tests & enhancement & cleanups

Add streaming test for both GLM 4.5 and minimax-m2.
Cleanup for preserved_tokens.
Cleanup for grammar rule name.
Enhance the parser's stability.
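
As a rough illustration of what streaming tests for an XML-style format have to exercise: the output arrives in growing prefixes, so a chunk may end in the middle of an opening tag, and that fragment should not be emitted as ordinary content. The sketch below is not taken from this commit. `safe_emit_length` is a hypothetical helper, and the `<tool_call>` tag in the usage note is only an illustrative value.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption, not the parser from this commit).
// Returns how many leading characters of `text` can be emitted as content now.
// Content stops at the first complete `tag`; otherwise any suffix of `text`
// that is a proper prefix of `tag` is held back until more input arrives.
static std::size_t safe_emit_length(const std::string & text, const std::string & tag) {
    if (tag.empty()) {
        return text.size(); // nothing to watch for
    }
    // A complete tag ends the content outright.
    const std::size_t full = text.find(tag);
    if (full != std::string::npos) {
        return full;
    }
    // Look for the longest suffix of `text` that is a proper prefix of `tag`.
    const std::size_t max_len = std::min(text.size(), tag.size() - 1);
    for (std::size_t len = max_len; len > 0; --len) {
        if (text.compare(text.size() - len, len, tag, 0, len) == 0) {
            return text.size() - len; // hold back the partial tag
        }
    }
    return text.size(); // no partial tag at the end
}
```

With the tag `<tool_call>`, the partial input `Hello <tool` yields 6, so only `Hello ` is emitted, while `a < b` yields 5 because its tail cannot start the tag.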

* cleanup & add support for Kimi-K2, Qwen3-Coder, Apriel-1.5, Xiaomi-MiMo

* apply suggestions from reviewers

* fix a misuse of data.grammar_lazy

* fix grammar when a tool has no arguments

* Fix `no triggers set for lazy grammar!` for GLM4.5/4.6. Insert additional stops for Kimi-K2

* update chat.cpp

* fix grammar for GLM 4.5/4.6

* Try to fix Jinja template for GLM

* Try to fix GLM-4.6.jinja

* Update common/chat-parser-xml-toolcall.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>

* Update tests/test-chat.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>

* improve chat template for GLM, rename Kimi-K2 template to Kimi-K2-Thinking

* Improve Kimi-K2 chat template

* Fix unit test

* Fix "Invalid tool call arguments passed" in a rare case.

In a rare case, the model may emit a raw string that begins with a valid JSON string. This commit adds unit tests to cover that scenario and fixes the regression introduced during the Kimi-K2 adaptation.
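
The distinction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit. Both helper names are hypothetical, and the policy shown (accept the value as a JSON string only when the literal spans the whole value) is one plausible way to keep a raw string such as `"abc" and more` from being cut off at the closing quote.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the commit): returns the index one past the
// closing quote of a JSON string literal starting at s[0], or npos if `s`
// does not start with a complete literal. Escape sequences are skipped but
// not validated, to keep the sketch short.
static std::size_t json_string_end(const std::string & s) {
    if (s.empty() || s[0] != '"') {
        return std::string::npos;
    }
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i] == '\\') {
            ++i;            // skip the escaped character
        } else if (s[i] == '"') {
            return i + 1;   // one past the closing quote
        }
    }
    return std::string::npos; // unterminated literal
}

// Hypothetical policy: treat the value as a JSON string only when the literal
// covers the entire value (trailing whitespace allowed). A value that merely
// begins with a valid JSON string is kept as a raw string.
static bool is_whole_json_string(const std::string & value) {
    const std::size_t end = json_string_end(value);
    if (end == std::string::npos) {
        return false;
    }
    return value.find_first_not_of(" \t\r\n", end) == std::string::npos;
}
```

Under this policy `"abc"` counts as a JSON string, while `"abc" and more` is passed through unchanged as raw text.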

---------

Co-authored-by: Sigbjørn Skjæret <redacted>
17 files changed:
common/CMakeLists.txt
common/chat-parser-xml-toolcall.cpp [new file with mode: 0644]
common/chat-parser-xml-toolcall.h [new file with mode: 0644]
common/chat-parser.h
common/chat.cpp
common/chat.h
common/json-partial.cpp
common/json-schema-to-grammar.cpp
common/json-schema-to-grammar.h
models/templates/GLM-4.6.jinja [new file with mode: 0644]
models/templates/Kimi-K2-Instruct.jinja [new file with mode: 0644]
models/templates/Kimi-K2-Thinking.jinja [new file with mode: 0644]
models/templates/MiMo-VL.jinja [new file with mode: 0644]
models/templates/MiniMax-M2.jinja [new file with mode: 0644]
models/templates/Qwen3-Coder.jinja [new file with mode: 0644]
models/templates/unsloth-Apriel-1.5.jinja [new file with mode: 0644]
tests/test-chat.cpp