author	Tarek Dakhran <redacted>
	Thu, 19 Feb 2026 11:18:57 +0000 (12:18 +0100)
committer	GitHub <redacted>
	Thu, 19 Feb 2026 11:18:57 +0000 (12:18 +0100)
commit	c5897995a726dc9ebdafd91d4cd552b95f4ac199
tree	7b79e0bf2929e4f7c617bc28dd31133283839108
parent	03fd9d3bb43f7d5132ded9d7a47740c07cffc76d

mtmd : chat : Fix extra \n between text and media marker (#19595)

* mtmd : chat : Fix extra \n between text and media marker

Thanks to @tugot17 for detecting and reporting the issue.

For vision models (e.g. LFM2.5-VL-1.6B and Qwen/Qwen3-VL-4B-Instruct), `llama-mtmd-cli` produces output identical to the HF implementation.

However, `llama-server` doesn't. I traced it down to an extra newline
inserted after `<__media__>`.

This happens in `to_json_oaicompat`, which treats media markers as text
and joins all parts with a `\n` separator.

This PR introduces a new type, `media_marker`, and uses it for media markers.
Extra logic is added to prevent insertion of newlines before and after
media markers.
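
The difference between the two joining strategies can be sketched as follows. This is a minimal illustration only, not the code in `common/chat.cpp`: the `content_part` struct and the `join_naive` / `join_marker_aware` helpers are hypothetical names, and the sketch assumes only two part types (`text` and `media_marker`).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one content part of a chat message.
// This is NOT the llama.cpp type; it exists only for this sketch.
struct content_part {
    std::string type; // assumed to be "text" or "media_marker"
    std::string text; // the text, or the marker string itself
};

// Behaviour described as the bug: every part is treated as text,
// so a "\n" is placed between all neighbouring parts.
static std::string join_naive(const std::vector<content_part> & parts) {
    std::string out;
    for (size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) {
            out += "\n";
        }
        out += parts[i].text;
    }
    return out;
}

// Sketch of the fix: a "\n" is inserted only between two adjacent
// text parts, never directly before or after a media marker.
static std::string join_marker_aware(const std::vector<content_part> & parts) {
    std::string out;
    for (size_t i = 0; i < parts.size(); ++i) {
        const bool cur_is_text  = parts[i].type == "text";
        const bool prev_is_text = i > 0 && parts[i - 1].type == "text";
        if (cur_is_text && prev_is_text) {
            out += "\n";
        }
        out += parts[i].text;
    }
    return out;
}
```

For the parts `"Describe"`, `<__media__>`, `"briefly."`, the naive join puts a newline on each side of the marker, while the marker-aware join concatenates them directly, so no newline tokens are added around the marker.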

With this change, the number of input tokens is identical to that of the HF
implementation, and as a result the output is also identical.

I explored other ways to address the issue:
* completely remove the `\n` between text parts in `to_json_oaicompat`
* merge text messages in server-common.cpp before sending them to `to_json_oaicompat`

Please propose alternative ways of fixing this issue.

* Refactor to use explicit per-type ifs

* Update common/chat.cpp

Co-authored-by: Piotr Wilkin (ilintar) <redacted>

* Update common_chat_templates_apply_legacy

---------

Co-authored-by: Piotr Wilkin (ilintar) <redacted>

common/chat.cpp
tools/server/server-common.cpp