git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
chat : Deepseek V3.1 reasoning and tool calling support (OpenAI Style) (#15533)
author Jesse <redacted>
Mon, 8 Sep 2025 14:59:48 +0000 (10:59 -0400)
committer GitHub <redacted>
Mon, 8 Sep 2025 14:59:48 +0000 (16:59 +0200)
commit 88021565f08e0b7c4e07ac089a15ec16fae9166c
tree 1300f012d879118328170dafcabb9f4c121821d7
parent 56920f56651908d5cce7a310dabf54ac4f6fbb7f

* Add DeepSeek V3.1 thinking mode support

- Added COMMON_CHAT_FORMAT_DEEPSEEK_V3_1 enum value
- Created common_chat_params_init_deepseek_v3_1() function (currently uses R1 implementation)
- Created common_chat_parse_deepseek_v3_1() function that handles V3.1 thinking format:
  - Extracts reasoning content before '</think>' tag into reasoning_content
  - Extracts regular content after '</think>' tag into content
  - No opening '<think>' tag in V3.1 format
- Added detection logic for V3.1 templates based on the pattern `message['prefix'] is defined and message['prefix'] and thinking`
- Added V3.1 case to parsing switch statement
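
The template detection described in the list above can be sketched as a plain substring check on the Jinja template source. This is a minimal illustration, not the code from common/chat.cpp: the helper name `looks_like_deepseek_v3_1` is hypothetical, and the assumption is that detection needs nothing more than finding the quoted pattern in the template text.

```cpp
#include <string>

// Hypothetical helper (not taken from the commit): returns true when the
// template source contains the V3.1 marker pattern quoted in the commit message.
bool looks_like_deepseek_v3_1(const std::string & tmpl_src) {
    static const std::string marker =
        "message['prefix'] is defined and message['prefix'] and thinking";
    return tmpl_src.find(marker) != std::string::npos;
}
```

A template that only mentions `thinking` without the `message['prefix']` checks would not match under this sketch.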

This addresses the issue where V3.1 outputs reasoning content followed by '</think>' and then regular content without the opening '<think>' tag.
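
A minimal sketch of the split described above, assuming the raw model output is available as one complete string. The struct and function names are hypothetical and do not mirror the parser API in common/chat.cpp. Treating output that lacks '</think>' as pure reasoning is also an assumption made for this illustration.

```cpp
#include <string>

// Hypothetical result type for this sketch.
struct parsed_message {
    std::string reasoning_content;
    std::string content;
};

// Splits V3.1-style output, which has no opening <think> tag:
// text before "</think>" is reasoning, text after it is regular content.
parsed_message split_v3_1_output(const std::string & output, bool thinking_enabled) {
    static const std::string end_tag = "</think>";
    parsed_message msg;
    if (!thinking_enabled) {
        // Non-thinking mode: everything is regular content.
        msg.content = output;
        return msg;
    }
    const auto pos = output.find(end_tag);
    if (pos == std::string::npos) {
        // Assumption: without a closing tag the model is still reasoning.
        msg.reasoning_content = output;
        return msg;
    }
    msg.reasoning_content = output.substr(0, pos);
    msg.content           = output.substr(pos + end_tag.size());
    return msg;
}
```

For example, the input `I should greet.</think>Hello!` with thinking enabled yields `I should greet.` as reasoning and `Hello!` as content.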

* Another attempt at V3.1 non-thinking

* Fix test, but it's not asserting anything.

* Ignore vim swap files in tests dir

* Update the test

* Try using try_find_literal instead of regex

* passing test

* Revert "Try using try_find_literal instead of regex"

This reverts commit c50d887ec2780dd9e6b8b397e92347d3db8d5575.

* Remove unnecessary change

* Remove comment

* Add code to handle non-thinking mode.

* Try to set message['prefix'] when thinking is enabled.

* This fixes reasoning, but breaks normal content. We need state in the
chat parser.

* DeepSeek V3.1 thinking is now the default. Disable with `--reasoning-budget 0`.

* Simplify (DeepSeek V3.1 reasoning)

* Fix sign inversion bug

* Add some tool calling code (not working).

* Tool calls working in non-reasoning mode.

* Attempt a unit test for tool call parsing.

* Passing test

* Add tests for both happy path and broken fenced DeepSeek V3.1 tool call variants.

* Passing DeepSeek V3.1 tool call tests, but model is not working.

* Revert assistant response prefill change. Not my monkeys.

* Add fenced_thinking unit test variant. Passes, but thinking tool calling
still isn't working for some reason.

* Tests pass in reasoning mode. Also e2e tool test passes.

* Make a copy of the parse_json_tool_calls function for deepseek-v3.1 so
as to not accidentally introduce regressions.

* Fix thinking_forced_open logic. Tool calling is broken. Need to add another
test case.

* That's what I get for cargo culting a newline.

* Add multi tool call test for deepseek v3.1 non-reasoning

* Move test, remove .gitignore change

* Place deepseek-v3.1 reasoning test directly into existing reasoning
function per CISC's request.

* Address whitespace CI failure.

* Merge two assert_equals per CISC's request.

* Add DeepSeek-V3.1 tests to tests/test-chat.cpp per CISC's request.

* Merge deepseek V3.1 and regular parse_json_tool_calls() function
behaviors by adding optional update_cursor argument.
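
The general technique of merging two near-duplicate functions through a defaulted parameter can be sketched as follows. The toy parser and `consume_literal` are hypothetical and are not the signature of parse_json_tool_calls(). What update_cursor controls in the real function is not stated in this message, so the cursor semantics below are an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy parser: an input string plus a read cursor.
struct toy_parser {
    std::string input;
    std::size_t pos = 0;
};

// Looks for `literal` at or after the cursor and returns true on a match.
// With update_cursor == true (the default, so existing callers are unchanged)
// the cursor moves past the match; with false the cursor is left untouched.
bool consume_literal(toy_parser & p, const std::string & literal, bool update_cursor = true) {
    const auto found = p.input.find(literal, p.pos);
    if (found == std::string::npos) {
        return false;
    }
    if (update_cursor) {
        p.pos = found + literal.size();
    }
    return true;
}
```

Because the parameter has a default value, call sites written for the original behavior compile unchanged, and only the new caller passes the extra argument.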

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* DeepSeek V3.1 fix reasoning_format none

* Strip grammar down to strictly what we expect based on model card. Throw
out parts we cargo culted from R1 that don't make sense.

* Update tests/test-chat-parser.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* DeepSeek V3.1 - Handle the edge case where thinking is forced open and the
reasoning content contains a tool call, but the model stops its output
without emitting the closing </think> tag (so the output is not a partial).
In this case, use the tool call found in the reasoning content.
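
The decision described in this edge case can be sketched as a small function. All names here are hypothetical. `tool_marker` stands in for whatever token opens a tool call section, and the handling of partial output is an assumption rather than the logic in common/chat.cpp.

```cpp
#include <string>

// Where a tool call should be taken from, if anywhere.
enum class tool_call_source { none, content, reasoning };

// Hypothetical sketch: decide which part of the output supplies the tool call.
tool_call_source pick_tool_call_source(const std::string & output,
                                       bool thinking_forced_open,
                                       bool is_partial,
                                       const std::string & tool_marker) {
    static const std::string end_tag = "</think>";
    const auto end_pos = output.find(end_tag);
    if (thinking_forced_open && end_pos == std::string::npos) {
        // The reasoning block was never closed.
        if (is_partial) {
            // Assumption: more output may still arrive, so decide nothing yet.
            return tool_call_source::none;
        }
        // Output is complete: fall back to a tool call inside the reasoning.
        return output.find(tool_marker) != std::string::npos
            ? tool_call_source::reasoning
            : tool_call_source::none;
    }
    // Normal case: only look for tool calls after the closing tag (if any).
    const auto start = end_pos == std::string::npos ? 0 : end_pos + end_tag.size();
    return output.find(tool_marker, start) != std::string::npos
        ? tool_call_source::content
        : tool_call_source::none;
}
```

Under this sketch, a tool call that appears inside reasoning that was properly closed with </think> is ignored, and only the unclosed, complete case falls back to the reasoning content.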

* DeepSeek V3.1 - simplify update_cursor

* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Update common/chat.cpp

Co-authored-by: Sigbjørn Skjæret <redacted>
* Fix indent

---------

Co-authored-by: openhands <redacted>
Co-authored-by: Sigbjørn Skjæret <redacted>
common/chat.cpp
common/chat.h
models/templates/README.md
models/templates/deepseek-ai-DeepSeek-V3.1.jinja [new file with mode: 0644]
tests/test-chat-parser.cpp
tests/test-chat.cpp