git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
model: add support for qwen3vl series (#16780)
author JJJYmmm <redacted>
Thu, 30 Oct 2025 15:19:14 +0000 (23:19 +0800)
committer GitHub <redacted>
Thu, 30 Oct 2025 15:19:14 +0000 (16:19 +0100)
commit d261223d24e97f2df50220e4a5b7f0adb69bba81
tree 2f4e3204c844895f75af5d0b2b2039f71c8be427
parent dcca0d3ab840ebe9b2ccd4719033d408eeb758d7

* support qwen3vl series.

Co-authored-by: Thireus ☠ <redacted>
Co-authored-by: yairpatch <redacted>
Co-authored-by: LETS-BEE <redacted>
* bugfix: fix the arch check for qwen3vl-moe.

* use build_ffn

* optimize deepstack structure

* optimize deepstack feature saving

* Revert "optimize deepstack feature saving" for temporal fix

This reverts commit f321b9fdf13e59527408152e73b1071e19a87e71.

* code clean

* use fused qkv in clip

* clean up / rm is_deepstack_layers for simplification

* add test model

* move test model to "big" section

* fix imrope check

* remove trailing whitespace

* fix rope fail

* metal : add imrope support

* add imrope support for sycl

* vulkan: add imrope w/o check

* fix vulkan

* webgpu: add imrope w/o check

* Update gguf-py/gguf/tensor_mapping.py

Co-authored-by: Sigbjørn Skjæret <redacted>
* fix tensor mapping

---------

Co-authored-by: Thireus ☠ <redacted>
Co-authored-by: yairpatch <redacted>
Co-authored-by: LETS-BEE <redacted>
Co-authored-by: Xuan Son Nguyen <redacted>
Co-authored-by: Georgi Gerganov <redacted>
Co-authored-by: Sigbjørn Skjæret <redacted>
28 files changed:
convert_hf_to_gguf.py
ggml/include/ggml.h
ggml/src/ggml-cpu/ops.cpp
ggml/src/ggml-cuda/rope.cu
ggml/src/ggml-metal/ggml-metal-device.cpp
ggml/src/ggml-metal/ggml-metal-impl.h
ggml/src/ggml-metal/ggml-metal.metal
ggml/src/ggml-sycl/rope.cpp
ggml/src/ggml-vulkan/ggml-vulkan.cpp
ggml/src/ggml-vulkan/vulkan-shaders/rope_head.glsl
ggml/src/ggml-vulkan/vulkan-shaders/rope_multi.comp
ggml/src/ggml-webgpu/wgsl-shaders/rope.tmpl.wgsl
gguf-py/gguf/constants.py
gguf-py/gguf/gguf_writer.py
gguf-py/gguf/tensor_mapping.py
include/llama.h
src/llama-arch.cpp
src/llama-arch.h
src/llama-hparams.cpp
src/llama-hparams.h
src/llama-kv-cache.cpp
src/llama-model.cpp
tests/test-backend-ops.cpp
tests/test-rope.cpp
tools/mtmd/clip-impl.h
tools/mtmd/clip.cpp
tools/mtmd/mtmd.cpp
tools/mtmd/tests.sh