llava : Add Granite Vision Support (#11794)
* Add super wip scripts for multimodal granite gguf
Signed-off-by: Alex-Brooks <redacted>
* Add example for converting mmgranite to gguf
Signed-off-by: Alex-Brooks <redacted>
* remove hardcoded path
Signed-off-by: Alex-Brooks <redacted>
* Add vision feature layer to gguf params
Signed-off-by: Alex-Brooks <redacted>
* Clean up llava surgery and remove name substitution hacks
Signed-off-by: Alex-Brooks <redacted>
* Add transformers llava next tensor name mapping
Signed-off-by: Alex-Brooks <redacted>
* Make siglip / openclip mutually exclusive
Signed-off-by: Alex-Brooks <redacted>
* Fix projector linear substitution
Signed-off-by: Alex-Brooks <redacted>
* Fix linear 2 substitution index
Signed-off-by: Alex-Brooks <redacted>
* Increase max flattened gridpoints to 64
Signed-off-by: Alex-Brooks <redacted>
* Fix hardcoded concat for multiple feature layers
Signed-off-by: Alex-Brooks <redacted>
* Pull vision feature layers out of gguf keys
Signed-off-by: Alex-Brooks <redacted>
* fix num gridpoints and use all layers
Signed-off-by: Alex-Brooks <redacted>
* Avoid dropping last image encoder layer in llava models
Signed-off-by: Alex-Brooks <redacted>
* Use 10 for max number of patches
Signed-off-by: Alex-Brooks <redacted>
* Standardize vision feature layers
Signed-off-by: Alex-Brooks <redacted>
* Cleanup logs
Signed-off-by: Alex-Brooks <redacted>
* Update comment for vision feature layer init
Signed-off-by: Alex-Brooks <redacted>
* Update notes for alternative to legacy llm conversion script
Signed-off-by: Alex-Brooks <redacted>
* Fix notes rendering
Signed-off-by: Alex-Brooks <redacted>
* Add v prefix to vision feature layer log
Signed-off-by: Alex-Brooks <redacted>
* Use current defaults for feature layer
Signed-off-by: Alex-Brooks <redacted>
* Use constant for max gridpoints / feat layers, style fixes
Signed-off-by: Alex-Brooks <redacted>
* clarify non-negative feature layers
Signed-off-by: Alex-Brooks <redacted>
* Remove CLIP_API from func signature
Signed-off-by: Alex-Brooks <redacted>
* Use MAX_IMAGE_FEATURE_LAYERS const in layer calc
Signed-off-by: Alex-Brooks <redacted>
* Clarify feature layers are non-negative ints and not uint
Signed-off-by: Alex-Brooks <redacted>
* Fix condition for reading feature layers
Signed-off-by: Alex-Brooks <redacted>
* pop last llava layer when feature layers are unset
Signed-off-by: Alex-Brooks <redacted>
* Fix unset vision layer 0
Signed-off-by: Alex-Brooks <redacted>
* Update examples/llava/clip.cpp
Co-authored-by: Xuan-Son Nguyen <redacted>
* Reenable assertion for out of bounds get_rows
Signed-off-by: Alex-Brooks <redacted>
* Use std vector for gridpoints and feature layers
Signed-off-by: Alex-Brooks <redacted>
* Calculate max feature layer at load time
Signed-off-by: Alex-Brooks <redacted>
* Include base patch for granite vision allocation
Signed-off-by: Alex-Brooks <redacted>
* Fix trailing whitespace
Signed-off-by: Alex-Brooks <redacted>
* Add max num patches = 10 back for minicpmv
Signed-off-by: Alex-Brooks <redacted>
* Use unordered set to store feature layers
Co-authored-by: Xuan-Son Nguyen <redacted>
Signed-off-by: Alex-Brooks <redacted>
* Use max feature layer for postnorm
Signed-off-by: Alex-Brooks <redacted>
* Apply suggestions from code review
---------
Signed-off-by: Alex-Brooks <redacted>
Co-authored-by: Xuan-Son Nguyen <redacted>