git.djapps.eu Git - pkg/ggml/sources/llama.cpp/commit
mtmd : add vision support for llama 4 (#13282)
author Xuan-Son Nguyen <redacted>
Mon, 19 May 2025 11:04:14 +0000 (13:04 +0200)
committer GitHub <redacted>
Mon, 19 May 2025 11:04:14 +0000 (13:04 +0200)
commit 92ecdcc06a4c405a415bcaa0cb772bc560aa23b1
tree 0886c219b56c662cf072a25a881c0a43864b28e1
parent f71f40a2847d4c9f57b86cd206e0a27b2bfb6d1c

* wip llama 4 conversion

* rm redundant __init__

* fix conversion

* fix conversion

* test impl

* try this

* reshape patch_embeddings_0

* fix view

* rm ffn_post_norm

* cgraph ok

* f32 for pos embd

* add image marker tokens

* Llama4UnfoldConvolution

* correct pixel shuffle

* fix merge conflicts

* correct

* add debug_graph

* logits matched, but it still perceives the image incorrectly

* fix style

* add image_grid_pinpoints

* handle llama 4 preprocessing

* rm load_image_size

* rm unused line

* fix

* small fix 2

* add test & docs

* fix llava-1.6 test

* test: add notion of huge models

* add comment

* add warn about degraded quality
convert_hf_to_gguf.py
docs/multimodal.md
gguf-py/gguf/constants.py
gguf-py/gguf/tensor_mapping.py
tools/mtmd/clip-impl.h
tools/mtmd/clip.cpp
tools/mtmd/clip.h
tools/mtmd/mtmd.cpp
tools/mtmd/tests.sh