git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
mtmd: Add dynamic high-resolution image preprocessing for InternVL model (#20847)
author bssrdf <redacted>
Mon, 23 Mar 2026 00:06:30 +0000 (20:06 -0400)
committer GitHub <redacted>
Mon, 23 Mar 2026 00:06:30 +0000 (01:06 +0100)
commit ec2b787ebe975cab1fa592c2505a6d3a9e4ff2e7
tree 8db5f162e30ae32651c2e24bdad2ae369eb94324
parent d3ac030a5d1edfbb7f7150126f6b9e9bf7c00c26

* add support for InternVL's dynamic high-resolution preprocessing (needed by Qianfan-OCR)

* add min/max dynamic patch to gguf meta

* clean up

* simplified handling min/max dynamic patch

* reuse llava_uhd logic for slice images

* provide default values for older models

* flake8

* prevent writing 0 value to gguf

* remove duplicated resolution candidates with a better algorithm

* fix indentation

* format

* add protection from divide by zero

* change to 0 to be safe

---------

Co-authored-by: Xuan Son Nguyen <redacted>
convert_hf_to_gguf.py
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
tools/mtmd/clip-impl.h
tools/mtmd/clip-model.h
tools/mtmd/clip.cpp
tools/mtmd/mtmd.cpp
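
For readers unfamiliar with the technique named in the commit message, the following is a minimal sketch of InternVL-style dynamic tiling: enumerate tile grids whose tile count lies between a minimum and maximum dynamic patch count, then pick the grid whose aspect ratio is closest to the image's. It is not the code from this commit. The function names `candidate_grids` and `pick_grid` and the defaults of 1 and 12 tiles are assumptions made for illustration, and the tie-breaking rule is simplified to prefer the grid with the fewest tiles.

```python
def candidate_grids(min_tiles: int, max_tiles: int) -> list[tuple[int, int]]:
    """Return every (cols, rows) grid with min_tiles <= cols * rows <= max_tiles.

    Each pair is visited exactly once, so the result holds no duplicate
    candidates. Grids are ordered by tile count, then by column count.
    """
    grids = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if min_tiles <= cols * rows <= max_tiles
    ]
    return sorted(grids, key=lambda g: (g[0] * g[1], g[0]))


def pick_grid(width: int, height: int,
              min_tiles: int = 1, max_tiles: int = 12) -> tuple[int, int]:
    """Pick the grid whose aspect ratio is closest to width / height.

    The defaults (1 and 12) are illustrative assumptions, not values taken
    from this commit.
    """
    # Guard against division by zero on degenerate image sizes.
    if width <= 0 or height <= 0:
        return (1, 1)

    grids = candidate_grids(min_tiles, max_tiles)
    if not grids:
        # An empty or inverted tile range falls back to a single tile.
        return (1, 1)

    aspect = width / height
    # rows is always >= 1, so cols / rows cannot divide by zero.
    # min() keeps the first best match, i.e. the grid with the fewest tiles.
    return min(grids, key=lambda g: abs(aspect - g[0] / g[1]))


if __name__ == "__main__":
    print(pick_grid(800, 400, min_tiles=1, max_tiles=4))
```

Under these assumptions, an 800x400 image with at most 4 tiles maps to a grid of 2 columns by 1 row; the image would then be resized to fit that grid and cut into equal tiles.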